BharatCast

Adaptive Multi-Scale AI for India

Weather and monsoon extremes forecasting in India with an indigenous multi-institutional AI framework.

BharatCast benchmarks global AI weather systems, trains indigenous spatio-temporal models on Indian datasets, refines forecasts across scales, and supports early warning for heavy rainfall, cyclones, and hydro-meteorological extremes.


Program: BharatCast Collaborative Initiative

Current Focus: Benchmarking, Modeling, Deployment

Core Benchmarking: GraphCast, Pangu, GenCast. Measure bias and skill drop over Indian datasets.

Primary Forecast Grid: 0.25°. India-focused AI for regional and local-scale weather dynamics.

Downscaling Goal: 0.125° / 0.0625°. Diffusion and flow-based high-resolution refinement.

Decision Support: Heavy rainfall and cyclones. Improved lead time and regional risk awareness.

How BharatCast Works

Compact overview of the current WeatherBench2 diffusion prototype

BharatCast currently uses a curated WeatherBench2 ERA5 subset, a patch-based conditional diffusion model, and global overlap-add stitching to generate the GT-vs-Pred videos shown below.

Data

WeatherBench2 subset

We start from a carefully selected WeatherBench2 ERA5 archive covering many years of global weather at regular 6-hour intervals.

It includes the main weather signals needed for forecasting, such as temperature, wind, pressure, humidity, and rainfall.
Upper-air information is retained across several standard pressure levels so the model can see how the atmosphere evolves vertically.
The result is a compact but meteorologically meaningful training set designed for efficient experimentation.

Formulation

Increment-based conditional diffusion

Instead of redrawing the entire future atmosphere from scratch, the model learns the next change that should happen from the current weather state.

Each training example asks the model to predict how the atmosphere should change over the next 24 hours.
The latest observed field acts as the anchor, so forecasts stay tied to the real atmospheric state.
This makes the prediction task more stable because the model focuses on what is changing, not on relearning what is already present.
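A minimal sketch of the increment formulation described above, assuming the archive is held as a NumPy array with time on the first axis. The helper names make_increment_pairs and apply_increment are hypothetical and are not taken from the BharatCast code; the 6-hour cadence and the 4-step (+24 h) offset come from the text.

```python
import numpy as np

STEP_HOURS = 6   # archive cadence stated in the text
PRED_LEN = 4     # 4 steps x 6 h = +24 h target

def make_increment_pairs(series, pred_len=PRED_LEN):
    """Build (anchor, increment) training pairs from a (time, ...) array.

    anchor[i]    = series[i]
    increment[i] = series[i + pred_len] - series[i]
    """
    anchors = series[:-pred_len]
    increments = series[pred_len:] - anchors
    return anchors, increments

def apply_increment(anchor, predicted_increment):
    """Recover the forecast state by adding the increment to the anchor."""
    return anchor + predicted_increment

# Toy data: 10 time steps of a 3x3 field (hypothetical values).
rng = np.random.default_rng(0)
series = rng.normal(size=(10, 3, 3))

anchors, increments = make_increment_pairs(series)
forecast = apply_increment(anchors[0], increments[0])
lead_hours = PRED_LEN * STEP_HOURS
```

With a perfect increment, the reconstructed forecast equals the state four steps ahead. The model only has to learn the change, not the full field.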

Training

Patch cache and UNet setup

The global fields are broken into smaller overlapping regions so the model can learn detailed local structure without losing atmospheric context.

We train on a large collection of overlapping weather patches drawn from the global archive.
A multi-scale UNet learns how a noisy candidate update should be cleaned into a physically consistent weather change.
The current prototype is a large-capacity model, with roughly 99 million parameters, built to capture both broad circulation and local detail.
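The denoising idea above can be sketched as follows. This is an illustration under stated assumptions, not the project's training loop: the log-uniform sampling of noise levels and the plain mean-squared-error loss are assumptions, and sample_sigma, add_noise, and denoising_loss are hypothetical names. The noise range, batch size, channel count, and patch size are the values quoted in the model note.

```python
import numpy as np

SIGMA_MIN, SIGMA_MAX = 0.01, 1.0  # noise range quoted in the model note

def sample_sigma(rng, batch_size, sigma_min=SIGMA_MIN, sigma_max=SIGMA_MAX):
    """Draw one noise level per sample, log-uniform in [sigma_min, sigma_max].

    Log-uniform sampling is an assumption made for this sketch.
    """
    log_sigma = rng.uniform(np.log(sigma_min), np.log(sigma_max), size=batch_size)
    return np.exp(log_sigma)

def add_noise(clean_increment, sigma, rng):
    """Corrupt a batch of clean increments with Gaussian noise of scale sigma."""
    noise = rng.normal(size=clean_increment.shape)
    # Broadcast one sigma per sample over (channels, height, width).
    noisy = clean_increment + sigma[:, None, None, None] * noise
    return noisy, noise

def denoising_loss(predicted_clean, clean_increment):
    """Mean squared error between the denoised output and the clean increment."""
    return float(np.mean((predicted_clean - clean_increment) ** 2))

rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 30, 64, 64))  # batch of 8, 30 channels, 64x64 patches
sigma = sample_sigma(rng, batch_size=8)
noisy, noise = add_noise(clean, sigma, rng)

# Loss obtained if a "model" simply returned its noisy input unchanged.
loss_identity = denoising_loss(noisy, clean)
```

In a real setup the UNet would take the noisy increment, the noise level, and the conditioning state, and would be trained to push this loss below the identity baseline.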

Inference + References

Global stitching and related systems

At forecast time, the model produces many local updates, and those updates are blended back together to recover a smooth full-globe prediction.

Overlapping patch predictions are softly combined so the final map looks continuous instead of blocky.
The resulting fields are rendered as side-by-side global videos for qualitative comparison against ground truth.
We compare this prototype against modern weather-AI systems and build on ideas from diffusion-based generative modeling.
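The soft blending of overlapping patches can be sketched as a weighted overlap-add. The Hann-shaped window, the toy 128×256 grid, and the function names blend_window and overlap_add are assumptions for illustration; the actual window, and any wrap-around handling in longitude, are not described in the text.

```python
import numpy as np

def blend_window(patch, eps=1e-3):
    """2-D weight that is largest at the patch centre and small at the edges."""
    w = np.hanning(patch + 2)[1:-1]  # drop the zero endpoints
    return np.outer(w, w) + eps

def overlap_add(patches, corners, out_shape, patch):
    """Blend square patches into one field by weighted averaging.

    patches: (n, patch, patch) array of local predictions
    corners: list of (row, col) top-left positions, one per patch
    """
    acc = np.zeros(out_shape)
    weight = np.zeros(out_shape)
    w = blend_window(patch)
    for p, (r, c) in zip(patches, corners):
        acc[r:r + patch, c:c + patch] += w * p
        weight[r:r + patch, c:c + patch] += w
    # Guard against division by zero where no patch contributes.
    return acc / np.maximum(weight, 1e-12)

patch, stride = 64, 32
H, W = 128, 256  # toy grid, not the real archive shape
truth = np.add.outer(np.linspace(0, 1, H), np.linspace(0, 2, W))

corners = [(r, c)
           for r in range(0, H - patch + 1, stride)
           for c in range(0, W - patch + 1, stride)]
patches = np.stack([truth[r:r + patch, c:c + patch] for r, c in corners])
stitched = overlap_add(patches, corners, truth.shape, patch)
```

When every patch agrees with the underlying field, the weighted average reproduces it exactly. When neighbouring patches disagree, the tapered weights fade each patch out toward its border, which is what removes visible seams.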

Forecast Console

Run configuration


Active mode: Forecasting and Validation

Inference backend: Current model on curated Indian weather data

Primary Forecast View

2m Temperature Evolution

WeatherBench2 Global forecast media

Current Model Note

Current forecast media is produced from a patch-based diffusion model trained on a WeatherBench2 subset stored at data/interim/subsets/era5_1p5deg_6h_2000-2018_minvars.zarr. The training cache uses 64×64 patches with a 32-cell stride, pred_len=4 (a +24h target at 6-hour resolution), and 30 flattened channels derived from 10 core variables / variable groups.
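As a rough illustration of the 64×64 patches with 32-cell stride, the sketch below cuts a (channels, height, width) array into overlapping patches with NumPy. The function name extract_patches and the toy 128×256 grid are hypothetical. How the real cache handles grid edges and longitude wrap-around is not stated, so this sketch simply keeps patches that fit entirely inside the array.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

PATCH, STRIDE = 64, 32  # patch size and stride quoted in the model note

def extract_patches(field, patch=PATCH, stride=STRIDE):
    """Cut a (channels, height, width) array into overlapping square patches.

    Returns an array of shape (n_patches, channels, patch, patch),
    ordered row by row over the patch grid.
    """
    windows = sliding_window_view(field, (patch, patch), axis=(1, 2))
    # windows: (channels, H - patch + 1, W - patch + 1, patch, patch)
    windows = windows[:, ::stride, ::stride]
    c, n_rows, n_cols, _, _ = windows.shape
    return windows.transpose(1, 2, 0, 3, 4).reshape(n_rows * n_cols, c, patch, patch)

# Toy input: 30 channels on a 128 x 256 grid (not the real archive shape).
field = np.arange(30 * 128 * 256, dtype=np.float32).reshape(30, 128, 256)
patches = extract_patches(field)
```

On this toy grid the patch grid has 3 rows and 7 columns, so each interior cell is covered by several patches. That overlap is what later allows smooth blending at inference time.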

The saved BharatCast checkpoint contains approximately 99.1 million parameters with an EMA weight footprint of about 378 MB, while the full best.pt training checkpoint on disk is approximately 1.54 GB. The current training run configuration uses batch_size=8, lr=2e-4, and diffusion noise settings sigma_min=0.01, sigma_max=1.0.
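The quoted EMA footprint is consistent with storing each parameter as a 32-bit float, if "MB" is read as mebibytes (2^20 bytes). Both the 32-bit storage and the unit reading are assumptions made for this check, not statements from the project; the helper weight_footprint is hypothetical.

```python
N_PARAMS = 99.1e6       # approximate parameter count from the text
BYTES_PER_FLOAT32 = 4   # assumed storage precision

def weight_footprint(n_params, bytes_per_param=BYTES_PER_FLOAT32):
    """Return the raw weight size as (megabytes, mebibytes)."""
    total_bytes = n_params * bytes_per_param
    return total_bytes / 1e6, total_bytes / 2**20

size_mb, size_mib = weight_footprint(N_PARAMS)
print(f"{size_mb:.1f} MB (10^6 bytes), {size_mib:.1f} MiB (2^20 bytes)")
```

The full training checkpoint is larger because it typically also holds the non-EMA weights and optimizer state; its exact contents are not specified here.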

This run was trained on an NVIDIA RTX 6000 Ada Generation GPU setup (48 GB class) for local experimentation and qualitative validation, ahead of larger-scale deployment on shared infrastructure.

Research Dashboard

Work packages

O1: Data Curation + Benchmarking. Benchmark global AI weather models on Indian data. NCMRWF, satellite, ground, and reanalysis harmonization.

O2-O4: BharatCast Modeling. Indigenous multi-scale AI with extremes sensitivity. GNNs, FNOs, diffusion, ensembles, and downscaling modules.

O5: Operational Deployment. Open-access web portal with APIs and dashboards. MeitY GPU integration for research, operations, and decision support.

Key Diagnostics

Evidence package

RMSE, ACC, and CRPS comparison: Prototype forecast-skill benchmarking

Spread-skill coverage: Uncertainty and calibration diagnostics

Rank and PIT histograms: Reliability evidence for probabilistic outputs

Power spectrum and SSIM: Spatial fidelity and structure preservation
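For readers unfamiliar with the first three diagnostics, here is a minimal NumPy sketch of RMSE, an uncentered anomaly correlation coefficient (ACC), and the standard ensemble estimator of CRPS. It is unweighted for brevity, whereas global evaluations normally apply latitude weights. The function names and toy data are hypothetical, and this is not the project's evaluation code.

```python
import numpy as np

def rmse(forecast, truth):
    """Root-mean-square error over all grid points."""
    return float(np.sqrt(np.mean((forecast - truth) ** 2)))

def acc(forecast, truth, climatology):
    """Uncentered anomaly correlation between forecast and truth."""
    f = forecast - climatology
    t = truth - climatology
    return float(np.sum(f * t) / np.sqrt(np.sum(f ** 2) * np.sum(t ** 2)))

def crps_ensemble(members, truth):
    """Ensemble CRPS estimate: E|X - y| - 0.5 * E|X - X'|, averaged over the grid.

    members: (n_members, ...) ensemble forecasts
    truth:   (...) verifying field
    """
    skill = np.mean(np.abs(members - truth), axis=0)
    spread = np.mean(np.abs(members[:, None] - members[None, :]), axis=(0, 1))
    return float(np.mean(skill - 0.5 * spread))

# Toy example: a forecast with a constant +0.1 bias (hypothetical values).
rng = np.random.default_rng(0)
truth = rng.normal(size=(4, 8))
climatology = np.zeros_like(truth)
forecast = truth + 0.1
members = np.stack([forecast, forecast])  # degenerate two-member ensemble

print(rmse(forecast, truth), acc(forecast, truth, climatology))
print(crps_ensemble(members, truth))
```

For a degenerate ensemble whose members are identical, the spread term vanishes and CRPS reduces to the mean absolute error. This is why CRPS is often described as a probabilistic generalization of MAE.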

Media Lab

Prototype validation media

BharatCast Prototype

India-relevant temperature forecasting evidence

Pressure evolution and large-scale structure

Precipitation and extremes validation

Wind and transport structure

Operational Deployment Core

Current gaps before operationalization

Benchmark GraphCast, Pangu-Weather, and GenCast on Indian datasets with regional bias reporting
Validate BharatCast against NCMRWF data across monsoon and non-monsoon regimes
Integrate MeitY GPU-backed inference, data pipelines, and API delivery for open-access forecasts
Extend 0.25° forecasts toward 0.125° and 0.0625° downscaled products
Add extreme-event modules for heavy rainfall, cyclones, and flood-oriented risk indicators

Development Roadmap

Year-wise platform milestones

Year 1: curate harmonized NWP, reanalysis, satellite, and observational datasets and deliver validated baseline benchmarking reports.

Year 2: develop BharatCast forecasting, downscaling, and extremes modules into a functional AI system validated on NCMRWF data.

Year 3: deploy the end-to-end open web platform with APIs, dashboards, and operational forecast delivery on MeitY infrastructure.

Reference Resources

Datasets and Operational Links

Curated benchmark datasets, India-focused data portals, and operational weather links relevant for training, validating, and deploying climate and weather AI systems.

Benchmark Dataset

WeatherBench 2

Open benchmark suite for data-driven global weather forecasting, widely used to compare deterministic and probabilistic forecast skill.

Storm Dataset

SEVIR

Large-scale severe-weather event dataset combining radar, satellite, lightning, and storm-event information for convective nowcasting.

Radar Benchmark

HKO-7 Radar Echo Dataset

Canonical precipitation nowcasting benchmark built from Hong Kong Observatory radar echo sequences for short-range spatio-temporal forecasting.

Global Reanalysis

Copernicus Climate Data Store

Official portal for ERA5 and related reanalysis, climate, and atmospheric datasets used for training, benchmarking, and verification workflows.

India Operational

IMD Open Data and Products

India Meteorological Department operational portal for forecasts, warnings, rainfall products, observations, and public weather bulletins.

India Data Portal

AIKosh

IndiaAI data repository for discoverable AI-ready datasets and future India-specific foundation-model training sources.

India Satellite Data

MOSDAC

Meteorological and Oceanographic Satellite Data Archival Centre for INSAT and related satellite products relevant for convection and monsoon analysis.

Operational Partner

NCMRWF

Operational forecasting partner for numerical weather prediction, diagnostics, forecast products, and validation workflows.

Research Papers

Foundation Models, Forecasting Papers & Platforms

Key papers and platform references spanning graph models, neural operators, diffusion models, and large-scale weather AI systems that inform the BharatCast prototype.

Science / DeepMind

GraphCast: Learning skillful medium-range global weather forecasting

Authors: Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, et al.

Graph neural network weather model that set a strong benchmark for modern AI-based medium-range global forecasting.

Nature / DeepMind

GenCast: Diffusion-based ensemble forecasting for medium-range weather

Authors: Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, et al.

Diffusion-based ensemble weather forecasting paper directly relevant to probabilistic atmospheric prediction.

arXiv / NVIDIA

FourCastNet

Authors: Jaideep Pathak, Shashank Subramanian, Ashesh Chattopadhyay, et al.

High-resolution global weather model using adaptive Fourier neural operators.

arXiv

Score-Based Generative Modeling through Stochastic Differential Equations

Authors: Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole

Core score-based SDE paper behind continuous-time diffusion modeling and conditional denoising formulations.

arXiv

Denoising Diffusion Probabilistic Models

Authors: Jonathan Ho, Ajay Jain, Pieter Abbeel

Foundational diffusion-model paper behind modern denoising-based generative learning.

arXiv

Fourier Neural Operator

Authors: Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, et al.

One of the core neural-operator papers underlying much of modern climate and weather surrogate modeling.

NVIDIA Platform

Earth-2

Authors: NVIDIA

Climate and weather digital-twin platform relevant as a systems-level reference for scalable AI weather infrastructure.

NVIDIA Tooling

Earth2Studio

Authors: NVIDIA

Tooling and workflows for Earth-2 and AI weather/climate experimentation.

Collaborative Project

Developed as a joint effort across academic institutes and operational forecasting partners.

IIT BHU

NCMRWF

IIT Roorkee

IIT Ropar

Incremental conditional diffusion for weather forecasting.

India-focused AI forecasting platform integrating benchmarking, modeling, downscaling, and operational delivery.
