Machine learning in CAD: beyond the hype
Machine learning has been in CAD for longer than the marketing suggests. Feature recognition, mesh cleanup, and constraint solving all used ML before 'AI' became a line item in every vendor's pitch deck.
Quick answer
Machine learning has been used in CAD for years: feature recognition in CAM, mesh repair algorithms, and constraint solving optimization. Recent additions include generative design, text-to-CAD, AI assistants, and natural language commands. The most impactful ML applications in CAD are still the boring ones: classification, search, and defect detection, not geometry generation.
Machine learning has been embedded in CAD software for longer than the marketing departments want you to realize. The most impactful ML applications in CAD are still the boring ones: feature recognition in CAM, part classification for search, mesh repair, and defect detection. Not geometry generation. I know this because I watched my CAM software correctly identify a pocket feature in my model last week, route the toolpath without my input, and save me about ten minutes of manual programming. Nobody called that AI. Nobody put it in a press release. It just worked, the way ML in CAD has been quietly working for years before "AI" became a line item on every vendor's pitch deck.
The recent wave of AI announcements, text-to-CAD, copilots, natural language commands, gets all the attention. And some of it is genuinely new. But understanding the full history of machine learning in CAD gives you better context for evaluating what's hype, what's incremental, and what might actually matter. The answer, as usual, is less exciting and more useful than the keynote version.
The quiet history: ML in CAD before anyone called it AI
Feature recognition in CAM has used machine learning techniques for over a decade. When your CAM software looks at a 3D model and identifies holes, pockets, slots, and bosses without you explicitly labeling them, that's pattern recognition. Early implementations used rule-based systems, but the better ones moved to statistical learning approaches years ago. Mastercam, Fusion 360's manufacturing workspace, and several other CAM tools use trained classifiers to recognize machining features and suggest operations. This is ML. It's just not new enough to put on a slide.
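At its core, feature recognition is a classification problem: extract a few geometric descriptors from a face set, then ask which known feature class it most resembles. Here is a deliberately tiny sketch of that idea using nearest-centroid classification. The descriptors (depth-to-width ratio, boundary circularity) and the centroid values are invented for illustration; real CAM tools extract far richer signatures from the B-rep.

```python
import math

# Invented "trained" centroids: mean descriptor vectors per feature class,
# as (depth/width ratio, boundary circularity). A real system would learn
# these from labeled machining features.
CENTROIDS = {
    "hole":   (3.0, 0.95),
    "pocket": (0.5, 0.40),
    "slot":   (0.8, 0.70),
}

def classify(descriptor):
    # Nearest-centroid classification: pick the class whose learned
    # centroid is closest to this feature's descriptor vector.
    return min(CENTROIDS, key=lambda c: math.dist(CENTROIDS[c], descriptor))

print(classify((2.8, 0.92)))   # deep, round → hole
print(classify((0.45, 0.35)))  # shallow, irregular → pocket
```

The point is the shape of the problem, not the algorithm: once features are vectors, swapping the nearest-centroid rule for a trained statistical model is an implementation detail, which is why CAM vendors could make that move quietly years ago.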
Mesh cleanup and repair is another area where ML arrived quietly. When you import a mesh from a 3D scan or a mesh-based tool and the software automatically identifies and fixes gaps, overlapping triangles, and non-manifold edges, there's often a trained model underneath doing the classification. "Is this gap an error or intentional geometry?" is exactly the kind of ambiguous classification problem that ML handles well. Tools like Materialise Magics and Artec Studio have been using ML-assisted repair for years.
Constraint solving optimization, the math that figures out how to satisfy your sketch constraints in real time, has benefited from ML approaches too. When SolidWorks or Fusion 360 solves a fully constrained sketch instantly, part of the efficiency comes from heuristics that learned good solving strategies from millions of constraint patterns. This is the kind of thing nobody notices because it just makes the software responsive. The only time you notice a constraint solver is when it fails, which is a different kind of learning.
Parts classification and search in PLM systems started using ML-based similarity matching before anyone was talking about AI in CAD. Siemens' Geolus shape search, which can find parts similar to a given 3D shape in a database, uses geometric feature extraction and similarity learning. It's been available since the mid-2010s. When a company with a million parts in their PDM system needs to find an existing bracket that's close to what they need instead of designing a new one, that search saves real engineering hours. It's some of the most commercially valuable ML in CAD, and it's been around longer than most people realize.
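Shape search of this kind typically reduces each model to a descriptor vector and ranks the library by similarity. A minimal sketch, with entirely hypothetical descriptors (bounding-box aspect ratio, hole count, surface-to-volume ratio) and cosine similarity standing in for whatever metric a tool like Geolus actually uses:

```python
import math

def cosine(a, b):
    # Cosine similarity between two descriptor vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical part library: name -> (aspect ratio, hole count, S/V ratio).
LIBRARY = {
    "bracket_a": (2.1, 4, 0.80),
    "bracket_b": (2.0, 4, 0.75),
    "shaft_01":  (8.0, 0, 0.30),
}

def most_similar(query):
    # Return the library part whose descriptor is most similar to the query.
    return max(LIBRARY, key=lambda name: cosine(LIBRARY[name], query))

print(most_similar((8.1, 0, 0.31)))  # → shaft_01
```

The engineering value lives in the descriptor design, not the ranking: a descriptor that captures what makes two brackets interchangeable is what turns "find similar shapes" into "don't design that part again."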
The recent wave: what's actually new
Starting around 2021 with the DeepCAD dataset and accelerating through 2024-2026, a genuinely new set of ML applications arrived in CAD. These are the ones getting the conference talks and the funding rounds.
Generative design uses ML-assisted topology optimization to explore design spaces under constraints. You define loads, materials, manufacturing methods, and performance targets. The software generates shapes that satisfy those constraints, often producing organic-looking geometries that no human would have drawn. Autodesk has had this in Fusion 360 for years. PTC has it in Creo GTO. This is the most mature "new wave" ML feature in CAD, and it works genuinely well for its specific use case: structural optimization.
Text-to-CAD is the flashiest new application. You describe a part in natural language, and an ML model generates parametric CAD geometry. The text-to-CAD guide covers the full landscape. The key ML innovation here came from treating CAD models as sequences of operations (sketch, extrude, fillet) and training transformer architectures to predict those sequences from text, similar to how language models predict the next word. The Text2CAD paper from NeurIPS 2024 formalized this approach, and it's now the foundation for several commercial tools.
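To make the sequence framing concrete, here is a toy sketch of a modeling history flattened into tokens, the kind of representation the Text2CAD line of work trains transformers to predict. The token format and the parameter quantization step are invented for illustration; published datasets use their own vocabularies.

```python
# A part's feature history as (operation, parameters) pairs.
history = [
    ("sketch",  {"plane": "XY"}),
    ("circle",  {"r": 10.0}),
    ("extrude", {"depth": 25.0}),
    ("fillet",  {"radius": 2.0}),
]

def tokenize(ops, step=0.5):
    # Quantize continuous parameters onto a grid so the model predicts
    # from a finite vocabulary, the way a language model predicts words.
    tokens = []
    for name, params in ops:
        tokens.append(f"<{name}>")
        for key, value in params.items():
            if isinstance(value, float):
                tokens.append(f"{key}={round(value / step) * step}")
            else:
                tokens.append(f"{key}={value}")
    return tokens

print(tokenize(history))
```

Once a part is a token sequence like this, "generate a part from text" becomes the familiar next-token prediction problem, which is exactly why transformer architectures transferred to CAD so directly.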
AI assistants and copilots are the third new category. Onshape's AI Advisor, SolidWorks' AURA and LEO, Creo AI Assistant, Solid Edge Design Copilot, and Autodesk Assistant all use large language models trained on CAD documentation and user interactions. They answer questions, suggest operations, diagnose errors, and in some cases execute commands from natural language input. The ML here is the language model itself. The CAD-specific part is the training data and the integration layer.
Natural language command execution, like Fusion 360's Text to Command concept, uses language models to translate spoken or typed descriptions into CAD operations. "Extrude this face by 15 mm" gets mapped to the specific API call in the CAD tool. This requires understanding both the user's intent and the software's command structure, which is a natural language understanding problem that LLMs handle reasonably well for well-defined operation sets.
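The mapping layer is easier to see with a sketch. A production system would use an LLM to parse intent; here a regex stands in for it, and the operation names and parameter keys are hypothetical, not any vendor's actual API:

```python
import re

# Constrained operation set: pattern -> (operation name, parameter key).
# A small, well-defined command vocabulary is what makes this tractable.
PATTERNS = {
    r"extrude .* by (?P<mm>[\d.]+)\s*mm": ("extrude", "distance_mm"),
    r"fillet .* (?P<mm>[\d.]+)\s*mm":     ("fillet",  "radius_mm"),
}

def to_command(text):
    # Translate a natural-language request into a structured command
    # that could be handed to a CAD tool's scripting API.
    for pattern, (op, param) in PATTERNS.items():
        match = re.search(pattern, text.lower())
        if match:
            return {"op": op, param: float(match.group("mm"))}
    return None  # intent not in the supported operation set

print(to_command("Extrude this face by 15 mm"))
# {'op': 'extrude', 'distance_mm': 15.0}
```

Note the `None` branch: refusing to guess outside the supported set is what separates a usable command interface from one that silently does the wrong thing.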
What actually works well: the boring stuff
If I ranked ML applications in CAD by actual impact on daily work, the order would look nothing like the ranking by conference attention.
At the top: feature recognition in CAM. Saves time on every manufactured part. Reliable. Mature. Boring.
Second: parts search and classification. Saves time every time someone needs to find an existing part instead of designing a new one. Most useful in large organizations with big part libraries. Invisible to anyone who doesn't manage a PLM system.
Third: defect detection in manufacturing. ML models trained on inspection data can identify surface defects, dimensional outliers, and process deviations faster and more consistently than manual inspection. This is more manufacturing than CAD, but it closes the loop: the model predicts what the part should look like, the inspection system checks what it actually looks like, and the ML classifier flags the gap. Companies doing high-volume production have been using this for several years.
Fourth: mesh repair and import cleanup. Saves time every time you import geometry from an external source, which for anyone doing cross-platform work is constantly. Not glamorous. Genuinely useful.
Fifth: generative design. Powerful for specific structural optimization problems. Not broadly applicable. Most engineers don't do topology optimization regularly. Those who do find it valuable.
Text-to-CAD and AI assistants rank lower on this list not because they're unimportant but because they're early. The impact today is small. The potential impact is large. But potential doesn't make parts.
What doesn't work well yet
Geometry generation from ML models is the most prominent weak spot. The how text-to-CAD works post covers the technical architecture. The short version: current models can generate simple parametric geometry from text descriptions, but the output is unreliable for anything beyond basic prismatic parts. The dimensions are approximate. The features are sometimes wrong. The manufacturing context is absent. The ML model learned what parts look like but not why they look that way.
Parametric prediction, where an ML model predicts not just a shape but a full parametric feature tree with proper constraints and design intent, is an active research area that hasn't produced reliable commercial results. The closest is Autodesk's Neural CAD concept, which aims to generate editable feature trees, but it's still in development. The fundamental problem is that design intent, the reason behind each feature and constraint, isn't encoded in most training data. The model sees the result but not the reasoning.
Assembly-level ML is almost nonexistent. Understanding how parts relate to each other, predicting interference, suggesting mating strategies, optimizing assembly sequences, these are all tasks that would benefit enormously from ML and that nobody has cracked at scale. Assembly data is scarce, complex, and deeply contextual. Two parts might be 10 mm apart because of a thermal expansion requirement, or because that's where the fastener access is, or because the designer forgot to update the constraint after the last revision. An ML model trained on assembly geometry alone can't distinguish between these reasons.
DFM-aware generation is the big missing piece. Training an ML model to generate geometry that can actually be manufactured requires training data that includes manufacturing context: process parameters, tooling constraints, material properties, tolerance requirements. That data is almost entirely proprietary. The CAD dataset problem is the root cause of most of text-to-CAD's current limitations.
The training data problem
Every ML application in CAD is limited by its training data, and CAD training data is uniquely scarce.
Image AI was trained on billions of images scraped from the internet. Language AI was trained on trillions of tokens from public text. CAD AI is trained on datasets in the low hundreds of thousands, mostly simple geometry, mostly missing the metadata that would make the models useful for engineering.
The DeepCAD dataset has about 178,000 models. That's the primary training set for text-to-CAD research. 178,000 sounds like a lot until you compare it to the billions of data points in other AI domains. And those 178,000 models are simple: sketch-and-extrude operations producing basic prismatic parts. No sweeps, no lofts, no sheet metal, no assemblies.
The Fusion 360 Gallery has about 8,000 models with design history. ShapeNet has around 51,000 3D models, mostly meshes. The ABC dataset has over a million models but without text annotations or manufacturing metadata.
Meanwhile, the really useful CAD data, parts designed for real products with real tolerances and real manufacturing constraints, sits inside corporate PDM systems behind firewalls. Companies don't share this data because it's proprietary, because it contains trade secrets, and because nobody has built a standard way to anonymize and contribute CAD models at scale.
This data gap is the single biggest constraint on ML in CAD. The models can only learn what they're shown. If they're shown 178,000 simple extrusions, they learn to generate simple extrusions. The ceiling won't move until the data does.
Where to watch for real progress
Not all areas of ML in CAD are advancing at the same rate. Here's where I think the most meaningful progress is likely in the next two to three years.
DFM validation layers. Not AI that generates manufacturable geometry from scratch, but AI that checks existing geometry against manufacturing rules and flags problems. This is a classification problem, which ML handles well, and the training data (known manufacturing failures, common DFM violations) is more available than generative training data. Several startups and at least two major vendors are working on this.
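The classification framing matters because a validation layer can start as plain rule checks over extracted geometry metrics and grow an ML model behind the same interface. A minimal sketch, with thresholds and metric names that are illustrative rather than taken from any tool:

```python
# Each rule: (metric key, pass predicate, violation message).
# Thresholds here are invented; real ones come from process capability.
RULES = [
    ("min_wall_mm",       lambda v: v >= 0.8,  "wall thinner than 0.8 mm"),
    ("hole_depth_to_dia", lambda v: v <= 10.0, "hole too deep to drill reliably"),
    ("internal_corner_r", lambda v: v > 0.0,   "sharp internal corner, unmachinable with a round tool"),
]

def check_dfm(metrics):
    # Return the violation messages for this part's extracted metrics.
    return [msg for key, ok, msg in RULES
            if key in metrics and not ok(metrics[key])]

part = {"min_wall_mm": 0.5, "hole_depth_to_dia": 4.0, "internal_corner_r": 0.0}
print(check_dfm(part))
# ['wall thinner than 0.8 mm', 'sharp internal corner, unmachinable with a round tool']
```

The ML enters when the predicates are learned from known manufacturing failures instead of hand-set, which is precisely the training data that is more available than generative data.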
Improved feature recognition and automatic machining strategy selection. CAM feature recognition has been good for simple features for years. Extending it to complex, multi-setup parts with compound geometry is harder, and ML models with more diverse training data will improve it. This is incremental progress on an existing strength, which is the kind of improvement that actually ships.
Better text-to-CAD for simple parts. The current tools will get more accurate, more dimensionally reliable, and better at handling a wider range of basic geometry. This won't happen through architecture breakthroughs alone. It'll happen through better and larger training datasets, which is a data engineering problem more than an ML research problem.
ML-assisted tolerance analysis. Given a model with nominal geometry, suggesting appropriate tolerances based on similar parts, common fit requirements, and manufacturing process capabilities. This would be enormously useful, and the training data exists inside companies' quality and inspection records, but nobody has assembled it at scale yet.
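In its simplest form this is similarity lookup over historical features, the same pattern as parts search. A toy sketch, with the historical data and the similarity metric invented for illustration; a real system would draw on a company's inspection and quality records:

```python
# Historical features: (nominal diameter mm, fit type) -> applied tolerance (mm).
# These numbers are invented placeholders, not a fit standard.
HISTORY = [
    ((10.0, "press_fit"), 0.015),
    ((10.0, "clearance"), 0.10),
    ((50.0, "clearance"), 0.25),
]

def suggest_tolerance(dia, fit):
    # Nearest neighbor among same-fit historical features by diameter.
    candidates = [(abs(d - dia), tol) for (d, f), tol in HISTORY if f == fit]
    return min(candidates)[1] if candidates else None

print(suggest_tolerance(12.0, "clearance"))  # → 0.1, from the 10 mm clearance bore
```

Even this crude version hints at why assembling the data matters more than the model: with enough real records, nearest-neighbor lookup already encodes a shop's accumulated tolerance judgment.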
The pattern: the areas likely to see real progress are the ones where ML is extending existing capabilities (search, classification, recognition) or where the training data problem is solvable (DFM rules, tolerance standards). The areas that remain hard are the ones that require data nobody has or reasoning nobody has modeled.
The honest assessment#
Machine learning has been making CAD software better for years, mostly in ways you never notice. The feature recognition that saves you time in CAM, the search that finds a similar part in your PLM system, the import repair that fixes a broken mesh from a 3D scan, these are all ML applications that work, that ship, and that nobody writes conference papers about anymore because they're just part of the software.
The new wave of ML in CAD, the text-to-CAD generators and the AI copilots, is more visible, more hyped, and less mature. It will get better. The research trajectory is clear. But the gap between "interesting research" and "reliable production tool" is the same gap it has always been in CAD: filled with manufacturing constraints, edge cases, and the accumulated judgment of people who have watched parts come back wrong.
If you want to benefit from ML in CAD today, use the boring stuff. Turn on feature recognition in your CAM workflow. Use your PLM system's search features. Let the import repair tools do their work. These are the ML applications that have earned trust through years of shipping, and they'll save you more time this week than any text-to-CAD demo.
For the new stuff, experiment. Try the text-to-CAD tools. Test the vendor copilots. See what works for your specific tasks. But test the output. Measure the parts. Don't trust the preview. The marketing says "AI is transforming CAD." The reality is more like "ML is making some of the tedious parts slightly less tedious, and also there's a chatbot now." Less exciting. More honest. About what I'd expect from a technology that's been quietly useful for years and only recently learned to generate press releases.