What We Took Away from GTC 2026

We presented our research on synthetic windshield damage detection at GTC 2026. Here is what we learned, what we heard on the floor, and why Physical AI is moving faster than anyone expected.

datadoo research
Mar 23, 2026 · 4 min read

We were in San Jose from March 16 to 19 for GTC 2026, presenting our research poster, "Detecting Windshield Damage Using Physically Realistic Synthetic Data." The work shows how we generate fully synthetic training datasets for laminated-glass damage detection using NVIDIA Omniverse, Replicator, and Cosmos-Transfer, and train a segmentation model that issues repair-vs.-replace decisions for auto insurers without ever seeing a real image.

The research

Windshield damage accounts for roughly 30% of all auto-insurance claims in an industry worth $37 to $41 billion globally. Getting real training data for this domain is notoriously difficult: transparent, reflective surfaces break conventional capture and annotation workflows. Labels are ambiguous. Edge cases are expensive to reproduce.

Our pipeline sidesteps this entirely. We author scenes in USD, randomize illumination, weather, camera parameters, glass curvature, lamination layers, and damage taxonomies procedurally, and render with RTX Real-Time 2.0. NVIDIA Replicator produces RGB, depth, normals, instance masks, and metadata for automatic labeling. Cosmos-Transfer then acts as a domain-control layer, augmenting variability through natural-language conditioning so we can target specific environments and edge cases at scale.
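As a rough, self-contained illustration of that randomization step: in production these parameters are driven through USD and Replicator graphs, not plain Python, and every range and name below is a hypothetical stand-in rather than our actual configuration.

```python
import random

# Hypothetical taxonomies -- illustrative stand-ins, not our real ones.
DAMAGE_TYPES = ["bullseye", "star_break", "combination", "crack", "chip"]
WEATHER = ["clear", "overcast", "rain", "fog"]

def sample_scene_params(rng: random.Random) -> dict:
    """Draw one randomized scene configuration for rendering."""
    return {
        "sun_elevation_deg": rng.uniform(5, 85),     # illumination
        "weather": rng.choice(WEATHER),
        "camera_distance_m": rng.uniform(0.3, 2.0),  # camera parameters
        "camera_focal_mm": rng.choice([24, 35, 50]),
        "glass_curvature": rng.uniform(0.0, 0.15),   # windshield geometry
        "lamination_layers": rng.randint(2, 3),      # laminated-glass stack
        "damage_type": rng.choice(DAMAGE_TYPES),     # damage taxonomy
        "damage_diameter_mm": rng.uniform(2, 60),
    }

# Seeded generation keeps every dataset build reproducible.
rng = random.Random(42)
scenes = [sample_scene_params(rng) for _ in range(1000)]
```

Because each scene is just a sampled parameter set, targeting an edge case (say, fog plus a long edge crack) reduces to constraining the sampler, which is the same lever Cosmos-Transfer's natural-language conditioning pulls at the image level.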

The result: a semantic segmentation model trained entirely on synthetic data, with a reworked stage-randomization pipeline that achieves 40% faster generation while preserving quality, and a 1.4x speedup in development iteration cycles.

The model does more than classify damage. Its segmentation mask measures the extent and position of the damage on the windshield and produces a qualified repair-vs.-replace decision comparable to the output of a human valuator. That is the kind of output insurers can act on.
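To make the measurement-to-decision step concrete, here is a toy sketch: mask-derived measurements (area, extent, distance from the glass edge) feed a threshold rule. The thresholds and the rule itself are illustrative assumptions only, not the policy our model actually applies.

```python
import numpy as np

def measure_mask(mask: np.ndarray, mm_per_px: float):
    """Derive physical measurements from a binary damage mask."""
    ys, xs = np.nonzero(mask)
    area_mm2 = float(mask.sum()) * mm_per_px ** 2
    extent_mm = max(xs.max() - xs.min(), ys.max() - ys.min()) * mm_per_px
    h, w = mask.shape
    edge_mm = min(xs.min(), ys.min(), w - 1 - xs.max(), h - 1 - ys.max()) * mm_per_px
    return area_mm2, extent_mm, edge_mm

def repair_or_replace(extent_mm: float, edge_mm: float,
                      in_driver_view: bool) -> str:
    """Toy decision rule; all thresholds are illustrative stand-ins."""
    if in_driver_view:
        return "replace"   # repairs leave optical distortion
    if edge_mm < 50:
        return "replace"   # edge damage compromises structural integrity
    if extent_mm > 30:
        return "replace"   # large damage is typically not repairable
    return "repair"

# Example: a small chip well inside the glass, at 2 mm per pixel.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[40:45, 40:50] = 1
area, extent, edge = measure_mask(mask, mm_per_px=2.0)
decision = repair_or_replace(extent, edge, in_driver_view=False)  # "repair"
```

The point of the sketch is the shape of the output: not a class label but a measured, explainable decision of the kind a human valuator would produce.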

What we heard on the floor

What struck us most was how consistent the pain points were across domains. Teams working on autonomous driving, warehouse robotics, infrastructure inspection, and medical imaging all described variations of the same bottleneck: they have good models, but they do not have enough good data.

"We cannot get edge cases." Safety-critical systems need to handle rare events, but rare events are, by definition, hard to capture. Synthetic generation lets you manufacture the long tail on demand.

"Labeling is killing us." Manual annotation is slow, expensive, and inconsistent. Auto-labeled synthetic data removes the bottleneck entirely.

"Privacy and compliance block us." Medical, insurance, and consumer-facing domains face strict constraints on real data. Synthetic data is privacy-safe by construction.

"We tried synthetic data and it did not transfer." This one matters. Not all synthetic data is created equal. If your renderer does not model physics correctly, you get a sim-to-real gap that no amount of domain randomization will close. This is exactly why we build on Omniverse with physics-first rendering rather than game-engine shortcuts.

GTC's big message: Physical AI is here

Jensen Huang's keynote made it clear that physical AI has moved from a research curiosity to a commercial priority. Autonomous vehicles, robotics platforms, and digital twins all require models that understand and operate in the physical world, not just the linguistic one.

That shift validates the core thesis behind datadoo: if your AI system needs to act in the real world, it needs training data that comes from the real world's physics. Cosmos-Transfer, which we already integrate into our pipeline, was featured prominently. So was the broader Omniverse ecosystem that we build on.

For us, GTC 2026 was a confirmation that the market is moving toward what we have been building for years. The conversations were real, the problems are urgent, and the appetite for production-grade synthetic data has never been higher.

What is next

We are following up with every team that engaged with us at the poster. Several of those conversations are already turning into pilot projects. If you are facing a data bottleneck in computer vision or physical AI, we would like to hear about it.

The poster is available on our site: datadoo.com/events/gtc2026. And if you want to talk, reach out: datadoo.com/contact
