Collaborative development across forecasting, diagnostics, and operational interface design.
Adaptive Multi-Scale AI for India
Weather and monsoon extremes forecasting in India with an indigenous multi-institutional AI framework.
BharatCast benchmarks global AI weather systems, trains indigenous spatio-temporal models on Indian datasets, refines forecasts across scales, and supports early warning for heavy rainfall, cyclones, and hydro-meteorological extremes.
How BharatCast Works
Compact overview of the current WeatherBench2 diffusion prototype
BharatCast currently uses a curated WeatherBench2 ERA5 subset, a patch-based conditional diffusion model, and global overlap-add stitching to generate the ground-truth-versus-prediction (GT-vs-Pred) videos shown below.
Data
WeatherBench2 subset
We start from a curated WeatherBench2 ERA5 subset covering global weather from 2000 to 2018 at regular 6-hour intervals on a 1.5° grid.
- It includes the main weather signals needed for forecasting, such as temperature, wind, pressure, humidity, and rainfall.
- Upper-air information is retained across several standard pressure levels so the model can see how the atmosphere evolves vertically.
- The result is a compact but meteorologically meaningful training set designed for efficient experimentation.
Formulation
Increment-based conditional diffusion
Instead of redrawing the entire future atmosphere from scratch, the model learns the next change that should happen from the current weather state.
- Each training example asks the model to predict how the atmosphere should change over the next 24 hours.
- The latest observed field acts as the anchor, so forecasts stay tied to the real atmospheric state.
- This makes the prediction task more stable because the model focuses on what is changing, not on relearning what is already present.
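The increment formulation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's training code: the function names `make_increment_pair` and `apply_increment` are hypothetical, the array layout is an assumption, and the 6-hour step and 4-step lead simply restate the 24-hour target described above.

```python
import numpy as np

STEP_HOURS = 6   # temporal resolution of the archive
PRED_LEN = 4     # steps ahead; 4 * 6 h = 24 h


def make_increment_pair(series, t, pred_len=PRED_LEN):
    """Return (anchor, increment) for time index t.

    series is assumed to have shape (time, channels, lat, lon).
    The anchor is the latest observed state; the increment is the
    change the model is asked to predict.
    """
    anchor = series[t]
    increment = series[t + pred_len] - anchor
    return anchor, increment


def apply_increment(anchor, predicted_increment):
    """Turn a predicted change back into a full forecast field."""
    return anchor + predicted_increment


# Synthetic stand-in for a weather archive (random values, not real data).
rng = np.random.default_rng(0)
series = rng.standard_normal((8, 3, 16, 32))

anchor, increment = make_increment_pair(series, t=2)
# With a perfect increment, the forecast recovers the future state.
forecast = apply_increment(anchor, increment)
```

Because the forecast is always the anchor plus a predicted change, a model that predicts a zero increment falls back to persistence rather than to an arbitrary field.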
Training
Patch cache and UNet setup
The global fields are broken into smaller overlapping regions so the model can learn detailed local structure without losing atmospheric context.
- We train on a large collection of overlapping weather patches drawn from the global archive.
- A multi-scale UNet learns how a noisy candidate update should be cleaned into a physically consistent weather change.
- The current prototype is a large-capacity model, with roughly 99 million parameters, built to capture both broad circulation and local detail.
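As a rough sketch of how overlapping patches might be cut from a global field, the snippet below tiles a (channels, lat, lon) array with square windows. The helper names are hypothetical, the 121 × 240 grid is an assumed shape for a 1.5° global field, and the example ignores longitude wrap-around, which the actual patch cache may treat differently.

```python
import numpy as np


def patch_origins(size, patch, stride):
    """Start indices that cover [0, size), with the last patch flush to the edge."""
    if size < patch:
        raise ValueError("field is smaller than the patch size")
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # extra patch so the far edge is covered
    return starts


def extract_patches(field, patch=64, stride=32):
    """Cut a (channels, lat, lon) field into overlapping square patches.

    Returns an array of shape (n_patches, channels, patch, patch) and
    the list of (lat_start, lon_start) origins for each patch.
    """
    _, n_lat, n_lon = field.shape
    origins = [
        (i, j)
        for i in patch_origins(n_lat, patch, stride)
        for j in patch_origins(n_lon, patch, stride)
    ]
    patches = np.stack([field[:, i:i + patch, j:j + patch] for i, j in origins])
    return patches, origins


# Assumed grid: 30 channels on a 121 x 240 lat-lon grid (random values).
field = np.random.default_rng(0).standard_normal((30, 121, 240))
patches, origins = extract_patches(field, patch=64, stride=32)
print(patches.shape)
```

A stride of half the patch size means most grid cells are seen by more than one patch, which is what later allows neighbouring predictions to be blended.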
Inference + References
Global stitching and related systems
At forecast time, the model produces many local updates, and those updates are blended back together to recover a smooth full-globe prediction.
- Overlapping patch predictions are softly combined so the final map looks continuous instead of blocky.
- The resulting fields are rendered as side-by-side global videos for qualitative comparison against ground truth.
- We compare this prototype with modern weather-AI systems and draw on ideas from diffusion-based generative modeling.
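The soft blending of overlapping patches can be illustrated with a weighted overlap-add. This is a sketch under stated assumptions: the function names are hypothetical, and the separable Hann taper is one common choice of blending weight, not necessarily the window BharatCast uses.

```python
import numpy as np


def blend_window(patch):
    """2-D taper that is largest at the patch centre and strictly positive."""
    w = np.hanning(patch + 2)[1:-1]  # drop the zero endpoints
    return np.outer(w, w)


def stitch(patches, origins, out_shape, patch):
    """Weighted overlap-add of (n, channels, patch, patch) patches into one field."""
    channels = patches.shape[1]
    acc = np.zeros((channels, *out_shape))
    weight = np.zeros(out_shape)
    w = blend_window(patch)
    for p, (i, j) in zip(patches, origins):
        acc[:, i:i + patch, j:j + patch] += p * w
        weight[i:i + patch, j:j + patch] += w
    # Every cell must be covered by at least one patch, so weight > 0.
    return acc / weight


# Small round-trip check on random data: cut a field into patches, stitch it back.
rng = np.random.default_rng(0)
field = rng.standard_normal((2, 8, 12))
patch, stride = 4, 2
origins = [
    (i, j)
    for i in range(0, 8 - patch + 1, stride)
    for j in range(0, 12 - patch + 1, stride)
]
patches = np.stack([field[:, i:i + patch, j:j + patch] for i, j in origins])
stitched = stitch(patches, origins, out_shape=(8, 12), patch=patch)
```

Dividing by the accumulated weight makes the result a weighted average in every cell, so patches that agree reproduce the field exactly and patches that disagree are blended smoothly rather than cut at a hard seam.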
Primary Forecast View
2m Temperature Evolution
Current Model Note
Current forecast media is produced from a patch-based diffusion model trained on a WeatherBench2 subset stored at data/interim/subsets/era5_1p5deg_6h_2000-2018_minvars.zarr. The training cache uses 64×64 patches with 32-cell stride, pred_len=4, corresponding to a +24h target at 6-hour resolution, and 30 flattened channels derived from 10 core variables / variable groups. The saved BharatCast checkpoint contains approximately 99.1 million parameters with an EMA weight footprint of about 378 MB, while the full best.pt training checkpoint on disk is approximately 1.54 GB. The current training run configuration uses batch_size=8, lr=2e-4, and diffusion noise settings sigma_min=0.01, sigma_max=1.0.
This run was trained on a single-workstation setup with an NVIDIA RTX 6000 Ada Generation GPU (48 GB class), used for local experimentation and qualitative validation before larger-scale deployment on shared infrastructure.
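The figures in this note can be cross-checked with a little arithmetic. The sketch below assumes the EMA weights are stored as 32-bit floats and that the quoted "MB" is a binary megabyte (MiB); both are assumptions, and the variable names are illustrative.

```python
STEP_HOURS = 6          # temporal resolution of the subset
PRED_LEN = 4            # forecast steps per training target
N_PARAMS = 99.1e6       # approximate parameter count from the note
BYTES_PER_PARAM = 4     # assumption: float32 storage
PATCH = 64              # patch edge length in grid cells
CHANNELS = 30           # flattened input channels

lead_hours = PRED_LEN * STEP_HOURS                # 4 * 6 = 24 hours
ema_mib = N_PARAMS * BYTES_PER_PARAM / 2**20      # bytes -> MiB
values_per_patch = CHANNELS * PATCH * PATCH       # numbers in one patch sample

print(f"lead time: +{lead_hours} h")
print(f"EMA weights: about {ema_mib:.0f} MiB")
print(f"values per patch: {values_per_patch}")
```

Under those assumptions the weight footprint comes out at roughly 378 MiB, matching the quoted figure, and each patch sample holds 122,880 values.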
Research Dashboard
Work packages
Key Diagnostics
Evidence package
Media Lab
Prototype validation media
Operational Deployment Core
Current gaps before operationalization
- Benchmark GraphCast, Pangu-Weather, and GenCast on Indian datasets with regional bias reporting
- Validate BharatCast against NCMRWF data across monsoon and non-monsoon regimes
- Integrate MeitY GPU-backed inference, data pipelines, and API delivery for open-access forecasts
- Extend 0.25° forecasts toward 0.125° and 0.0625° downscaled products
- Add extreme-event modules for heavy rainfall, cyclones, and flood-oriented risk indicators
Development Roadmap
Year-wise platform milestones
Year 1: curate harmonized NWP, reanalysis, satellite, and observational datasets and deliver validated baseline benchmarking reports.
Year 2: develop BharatCast forecasting, downscaling, and extremes modules into a functional AI system validated on NCMRWF data.
Year 3: deploy the end-to-end open web platform with APIs, dashboards, and operational forecast delivery on MeitY infrastructure.
Reference Resources
Datasets and Operational Links
Curated benchmark datasets, India-focused data portals, and operational weather links relevant for training, validating, and deploying climate and weather AI systems.
WeatherBench 2
Open benchmark suite for data-driven global weather forecasting, widely used to compare deterministic and probabilistic forecast skill.
SEVIR
Large-scale severe-weather event dataset combining radar, satellite, lightning, and storm-event information for convective nowcasting.
HKO-7 Radar Echo Dataset
Canonical precipitation nowcasting benchmark built from Hong Kong Observatory radar echo sequences for short-range spatio-temporal forecasting.
Copernicus Climate Data Store
Official portal for ERA5 and related reanalysis, climate, and atmospheric datasets used for training, benchmarking, and verification workflows.
IMD Open Data and Products
India Meteorological Department operational portal for forecasts, warnings, rainfall products, observations, and public weather bulletins.
AIKosh
IndiaAI data repository for discoverable AI-ready datasets and future India-specific foundation-model training sources.
MOSDAC
Meteorological and Oceanographic Satellite Data Archival Centre for INSAT and related satellite products relevant for convection and monsoon analysis.
NCMRWF
Operational forecasting partner for numerical weather prediction, diagnostics, forecast products, and validation workflows.
Research Papers
Foundation Models, Forecasting Papers & Platforms
Key papers and platform references spanning graph models, neural operators, diffusion models, and large-scale weather AI systems that inform the BharatCast prototype.
GraphCast: Learning skillful medium-range global weather forecasting
Authors: Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, et al.
Graph neural network weather model that set a strong benchmark for modern AI-based medium-range global forecasting.
GenCast: Diffusion-based ensemble forecasting for medium-range weather
Authors: Ilan Price, Matthew Willson, Jonathan Horsfield, Richard Turner, et al.
Diffusion-based ensemble weather forecasting paper directly relevant to probabilistic atmospheric prediction.
FourCastNet
Authors: Jaideep Pathak, Shashank Subramanian, Ashesh Chattopadhyay, et al.
High-resolution global weather model using adaptive Fourier neural operators.
Score-Based Generative Modeling through Stochastic Differential Equations
Authors: Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole
Core score-based SDE paper behind continuous-time diffusion modeling and conditional denoising formulations.
Denoising Diffusion Probabilistic Models
Authors: Jonathan Ho, Ajay Jain, Pieter Abbeel
Foundational diffusion-model paper behind modern denoising-based generative learning.
Fourier Neural Operator
Authors: Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, et al.
One of the core neural-operator papers underlying much of modern climate and weather surrogate modeling.
Earth-2
Authors: NVIDIA
Climate and weather digital-twin platform relevant as a systems-level reference for scalable AI weather infrastructure.
Earth2Studio
Authors: NVIDIA
Tooling and workflows for Earth-2 and AI weather/climate experimentation.
Collaborative Project
Developed as a joint effort across academic institutes and operational forecasting partners.