The Atlantic hurricane season that opened on June 1 looks unusual not for the storms it is expected to bring, but for who—or what—is now helping to forecast them. For the first time, machine-learning weather models that spent the 2025 season in an experimental lane have been promoted to the operational toolkit that professional forecasters lean on when a system spins up in the tropics. The change is quiet, but it marks a turning point in how one of science's hardest prediction problems gets solved.
According to a pre-season outlook compiled by the insurance broker Howden, the U.S. National Weather Service has moved AI-based forecasting from experiment to full deployment for 2026. The headline example is Google DeepMind's WeatherNext 2, a global model that can render forecasts at hourly resolution and spin out hundreds of possible weather scenarios from a single starting point. Each scenario takes under a minute on a single tensor processing unit—work that a conventional physics-based model would hand to a supercomputer for hours. The system has already been threaded into Google Search, Gemini, Pixel Weather and the Maps Platform weather feeds, putting probabilistic forecasts in front of ordinary phone users without anyone noticing the machinery underneath.
The backdrop is a relatively calm official outlook. NOAA's seasonal forecast, released June 1, leans toward a below-normal year: a 55 percent chance of below-normal activity, with 8 to 14 named storms expected, 3 to 6 of them reaching hurricane strength and 1 to 3 becoming major hurricanes. You can read the full breakdown in NOAA's announcement. A quieter season, forecasters note, is exactly the right moment to bed in new tools before a high-stakes storm puts them to the test.
Why It Matters
For most of the modern era, weather prediction has been a contest of raw computing power. Numerical weather prediction works by chopping the atmosphere into a grid and solving the physics of fluid motion forward in time, a task so demanding that national meteorological agencies were among the first customers for supercomputers. That cost acted as a gatekeeper. Only institutions with nine-figure budgets could run the best models, and ensemble forecasts—the dozens of slightly different runs that reveal how confident a prediction really is—were rationed accordingly.
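To make "chopping the atmosphere into a grid and solving the physics forward in time" concrete, here is a deliberately tiny sketch. It is not taken from any operational model. It moves a single quantity around a one-dimensional ring of grid cells with a first-order upwind scheme, and the function name `advect`, the 40-cell grid, and the wind speed are all illustrative assumptions. Real models do this in three dimensions for many coupled variables, which is where the supercomputer bill comes from.

```python
def advect(field, wind_speed, dx, dt, steps):
    """Advance a 1D periodic field with a first-order upwind scheme (wind_speed > 0)."""
    courant = wind_speed * dt / dx
    if not 0 < courant <= 1:
        # Stability (CFL) condition: information may not cross more than one cell per step.
        raise ValueError("time step too large for this grid spacing")
    for _ in range(steps):
        # Each cell is updated from its upwind (left) neighbour; index -1 wraps around the ring.
        field = [field[i] - courant * (field[i] - field[i - 1]) for i in range(len(field))]
    return field


# Hypothetical setup: a blob of moisture in cell 5 of a ring of forty 10 km cells,
# carried by a 10 m/s wind in 500-second time steps.
grid = [0.0] * 40
grid[5] = 1.0
later = advect(grid, wind_speed=10.0, dx=10_000.0, dt=500.0, steps=20)

peak = max(range(len(later)), key=later.__getitem__)
print("peak started in cell 5 and is now in cell", peak)
print("total amount is conserved:", abs(sum(later) - 1.0) < 1e-9)
```

The cost grows quickly: halving the grid spacing forces a smaller time step as well, so sharper forecasts demand disproportionately more computation.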
AI changes the arithmetic. A model like WeatherNext 2 learns the statistical patterns of the atmosphere from decades of historical data, then reproduces a forecast in a fraction of the time and energy. The practical effect is that running a large ensemble stops being a luxury. When hundreds of scenarios can be generated in minutes, forecasters can see the full spread of possibilities for where a storm might track and how strong it might get, which is precisely the information emergency managers need when they decide whether to evacuate a coastline.
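A rough sketch of how an ensemble turns into something an emergency manager can act on follows. Everything in it is hypothetical: the members are random draws standing in for model runs, landfall is reduced to a single position along a straight coastline, and the names `simulate_ensemble` and `strike_probability` are invented for illustration. The point is only that once hundreds of scenarios exist, spread and strike probability fall out of simple counting.

```python
import random
import statistics


def simulate_ensemble(n_members, base_landfall_km, spread_km, seed=0):
    """Return stand-in landfall positions (km along a coastline), one per ensemble member."""
    rng = random.Random(seed)  # seeded so the toy example is reproducible
    return [rng.gauss(base_landfall_km, spread_km) for _ in range(n_members)]


def strike_probability(landfalls_km, town_km, radius_km):
    """Fraction of ensemble members that make landfall within radius_km of the town."""
    hits = sum(1 for x in landfalls_km if abs(x - town_km) <= radius_km)
    return hits / len(landfalls_km)


members = simulate_ensemble(n_members=500, base_landfall_km=100.0, spread_km=60.0)

print("mean landfall position (km):", round(statistics.mean(members), 1))
print("ensemble spread, std dev (km):", round(statistics.stdev(members), 1))
print("share of members within 50 km of a town at km 150:",
      strike_probability(members, town_km=150.0, radius_km=50.0))
```

With only a dozen members, a probability like this moves in coarse steps of about 8 percent. With hundreds, it becomes fine-grained enough to compare against an evacuation threshold.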
The shift is not confined to one company. The European Centre for Medium-Range Weather Forecasts now reports that its own AI model beats its physics-based system by up to 20 percent on certain variables, and independent benchmarks show models such as GraphCast, FourCastNet and newer entrants matching or surpassing the long-dominant European system on many targets. A capability that was a research curiosity three years ago has become a competitive field, and the people who benefit most are the ones downstream: insurers pricing risk, grid operators bracing for wind, and coastal residents reading a weather app.
Reaction
The forecasting community's response has been enthusiasm tempered with hard-won caution. The most cited evidence came from the 2025 season, when WeatherNext flagged a Category 5 hurricane roughly five days out with about 80 percent confidence. The National Hurricane Center's own verification found that the AI model slightly outperformed its official track forecasts in the 12-to-72-hour range and matched its skill on intensity across three Category 5 storms—a striking result for a tool that, unlike the NHC's seasoned human forecasters, has no physical understanding of a hurricane at all.
Yet meteorologists are careful not to oversell it. A study published in Science Advances found that physics-based models still outperform AI forecasts when it comes to record-breaking extremes—the once-in-a-generation events that, by definition, are underrepresented in the historical data an AI learns from. Because these models predict by analogy to the past, a storm with no precedent is exactly where they are most likely to stumble. That is an uncomfortable limitation for hurricanes, where the rare monster storm is the one that matters most.
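The "prediction by analogy" weakness can be illustrated with the simplest possible analog forecaster, which averages the outcomes of the most similar past cases. This is a caricature, not a description of how WeatherNext or any neural model works, and the record of pressure deficits and peak winds below is made up. It does show the failure mode in its purest form: such a predictor can never forecast anything stronger than what it has already seen.

```python
def analog_forecast(history, query, k=3):
    """Predict by averaging the outcomes of the k past cases most similar to the query."""
    nearest = sorted(history, key=lambda case: abs(case[0] - query))[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)


# Hypothetical record: (central pressure deficit in hPa, peak wind in knots).
# In this invented data the wind rises by 20 knots for every 10 hPa.
history = [(10, 40), (20, 60), (30, 80), (40, 100), (50, 120), (60, 140)]

inside = analog_forecast(history, 35)        # a case well covered by the record
unprecedented = analog_forecast(history, 100)  # far beyond anything observed

print("forecast for a 35 hPa deficit:", inside)
print("forecast for a 100 hPa deficit:", unprecedented)
print("strongest wind in the record:", max(w for _, w in history))
```

For the unprecedented case the forecast stays below the strongest storm in the record, even though the trend in the data points to something far more intense. Modern machine-learning models generalize better than this toy, but the study's finding suggests the same pull toward the familiar persists at the extremes.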
This is why the National Weather Service is deploying AI alongside its traditional systems rather than in place of them. The emerging consensus treats the machine-learning models as a fast, cheap second opinion—superb at painting the probable picture quickly, but still anchored by physics-based guidance and human judgment when a forecast carries lives in the balance. The humility is the point: a tool that is right most of the time is only trustworthy if everyone using it understands when it might be wrong.
What's Next
The AI making its debut this season is not limited to the headline forecasting models. NOAA is weaving machine learning deeper into the unglamorous plumbing of hurricane observation. At its Atlantic Oceanographic and Meteorological Laboratory, researchers are using ML to quality-control the data streaming from the tail Doppler radar mounted on the "Hurricane Hunter" aircraft that fly straight through storms. The new method salvages more than 25 percent additional usable data from each flight, sharpening the picture of a storm's structure and winds that forecasters rely on.
For the first time, the 2026 season will also feed data from small uncrewed aircraft—drones launched into the heart of a hurricane—directly into NOAA's Hurricane Analysis and Forecast System. Early work suggests that incorporating these low-altitude readings can improve intensity forecasts by around 10 percent, addressing the single hardest problem in tropical meteorology: not where a storm will go, but how strong it will be when it arrives. Intensity forecasting has lagged track forecasting for decades, and rapid intensification near landfall remains the nightmare scenario for coastal communities.
Taken together, the season ahead is less a single breakthrough than a convergence: AI models for speed and breadth, drones and radar for richer observations, and human forecasters to weigh it all. If the technology proves itself through a full season—ideally a calm one—the case for trusting it during the next catastrophic storm grows considerably stronger.
Closing Thoughts
There is something fitting about the atmosphere being the proving ground for applied AI. Weather is the original chaotic system, the canonical example of how tiny uncertainties cascade into enormous ones, and the field that gave us the phrase "the butterfly effect." That a pattern-matching model trained on the past can now anticipate a Category 5 storm days in advance is a genuine achievement, and a reminder of how much structure hides inside apparent chaos.
But the deployment also models a healthier relationship with the technology than the hype cycle usually allows. Forecasters are adopting AI not because it is infallible, but because it is useful in specific, measurable ways—and they are keeping the older tools close precisely because they know where the new ones break. The most consequential uses of artificial intelligence may turn out to look like this: not a dramatic replacement of human expertise, but a quiet new instrument added to a crowded toolkit, earning trust one season at a time. The storms of 2026 will deliver the first real verdict.
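Summary
With the 2026 Atlantic hurricane season that began on June 1, the U.S. National Weather Service (NWS) has moved AI-based forecast models from the experimental stage to full operational use. The leading example is Google DeepMind's WeatherNext 2, which generates hundreds of weather scenarios, each in under a minute on a single TPU. Because it handles far faster and more cheaply the work that conventional physics-based numerical prediction spent hours computing on supercomputers, ensemble forecasting is no longer a luxury reserved for a few institutions. NOAA projects a below-normal year (8 to 14 named storms), and a relatively quiet season is seen as a good moment to validate new tools.
Trust in the approach rests on results from the 2025 season, when WeatherNext flagged a Category 5 hurricane about five days out with 80 percent confidence, and National Hurricane Center (NHC) verification found its 12-to-72-hour track forecasts slightly ahead of the official forecasts. Scientists remain cautious, however. According to a Science Advances study, physics-based models are still more accurate for record-breaking extremes, because AI is vulnerable to unprecedented events that are absent from historical data. That is why the NWS is using AI not as a replacement for its existing systems but alongside them, as a fast, cheap "second opinion."
This season AI also reaches deep into the observation infrastructure, not just the forecast models. NOAA uses machine learning to clean Doppler radar data from the "Hurricane Hunter" aircraft, increasing the amount of usable data by more than 25 percent, and for the first time is feeding data from small uncrewed aircraft (drones) directly into its hurricane forecast system, which it expects to improve intensity forecast accuracy by about 10 percent. In the end, this year's change is less a single innovation than a convergence of AI's speed and breadth, richer observations from drones and radar, and the judgment of human forecasters. If the technology comes through a full season of validation, the grounds for trusting it in the next major storm will be that much firmer.
References: NOAA 2026 season outlook · Google DeepMind WeatherNext 2 · Science Advances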