What text-to-CAD is, and what it isn't
Text-to-CAD turns a typed description into actual editable CAD geometry. It's not text-to-3D, it's not generative design, and it's not magic. Here's what it really is.
Quick answer#
Text-to-CAD is AI that converts natural language prompts into editable B-Rep CAD models (STEP, SCAD, or native CAD files), not meshes. It produces real parametric geometry you can fillet, chamfer, and dimension, unlike text-to-3D tools that output amorphous mesh blobs.
The first time I tried text-to-CAD, I was sitting at my desk with half a lunch break left and a bracket I didn't feel like modeling by hand. I typed something like "L-bracket with two mounting holes, 3mm thick, 40mm legs" into Zoo's text-to-CAD tool, hit enter, and watched the screen think for about fifteen seconds. What came back was an actual L-bracket. Not a render. Not a concept sketch. An actual solid body with fillets, holes, and a STEP file I could open in Fusion 360. I rotated it, measured it, selected a face, and it behaved like real geometry. I also noticed the holes were slightly too close to the edge for any machinist who values their end mills, but that's a separate conversation.
Text-to-CAD is AI that takes a written description and produces editable CAD geometry from it. Not a mesh. Not a point cloud. Not a pretty picture. Actual B-Rep solid models you can open in real CAD software, dimension, modify, and send to manufacturing.
That one sentence is the whole idea, and also the whole reason it matters.
B-Rep vs mesh, and why you should care#
If you've spent any time in CAD, you already know the difference between a solid body and a mesh, even if you don't think about it in those terms. A solid body in SolidWorks or Fusion 360 has faces, edges, and vertices that the software understands as geometric entities. You can select a face and extrude it. You can fillet an edge. You can add a hole with a specific diameter and the software knows it's a hole, not just a collection of triangles that happen to form a circle-shaped opening.
That's B-Rep, or Boundary Representation. It's how professional CAD has worked for decades. The geometry is defined by mathematical surfaces and their boundaries, not by a skin of tiny triangles approximating a shape.
A mesh is the other thing. It's what you get from text-to-3D tools, from 3D scanning, from game engines, and from most of the AI-generated 3D content that's been making the rounds on social media. A mesh is a bag of triangles. It can look like a bracket, but it doesn't know it's a bracket. Try to fillet an edge on a mesh import in SolidWorks and you'll get a look from the software that roughly translates to "I don't know what you're talking about."
This is the single most important distinction in the text-to-CAD conversation. When a text-to-CAD tool generates a B-Rep model, you get something you can work with in an engineering context. When a text-to-3D tool generates a mesh, you get something that looks nice in a viewport and becomes a problem the moment you need to do anything useful with it. I've imported enough STL files from various "AI 3D generators" to know how that afternoon goes. You spend more time converting the mesh into a solid than it would have taken to model the part from scratch.
A STEP file from a text-to-CAD tool opens in your CAD software as geometry with selectable faces and real edges. An OBJ file from a text-to-3D tool opens as one fused lump that the feature tree treats like a foreign object. If your work ends at "look at this cool shape," the mesh is fine. If your work continues into tolerancing, manufacturing, assembly, or any revision more sophisticated than rotating the camera, you need B-Rep. For a more detailed breakdown of why this matters, I wrote about it in text-to-CAD vs text-to-3D.
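The difference is easy to see in data-structure terms. Here's a toy Python sketch of the two representations — all the class and field names are illustrative, not any real kernel's API: a mesh is a bag of triangles with no feature semantics, while a feature-based model stores the design intent, so editing a hole means changing one parameter and regenerating.

```python
from dataclasses import dataclass, field

Vertex = tuple[float, float, float]

@dataclass
class Mesh:
    # A bag of triangles. Nothing here "knows" it is a hole; resizing one
    # means recomputing every vertex that approximates its boundary.
    triangles: list[tuple[Vertex, Vertex, Vertex]]

@dataclass
class HoleFeature:
    # The feature stores intent: a center and a diameter, not triangles.
    center: tuple[float, float]
    diameter_mm: float

@dataclass
class BracketModel:
    thickness_mm: float
    leg_mm: float
    holes: list[HoleFeature] = field(default_factory=list)

    def resize_hole(self, index: int, new_diameter_mm: float) -> None:
        # One parameter change; the kernel would regenerate the boundary.
        self.holes[index].diameter_mm = new_diameter_mm

bracket = BracketModel(thickness_mm=3.0, leg_mm=40.0,
                       holes=[HoleFeature((10.0, 10.0), 4.2)])
bracket.resize_hole(0, 5.0)
print(bracket.holes[0].diameter_mm)  # → 5.0
```

Real B-Rep kernels are vastly more complicated than this, but the asymmetry holds: the parametric model is a recipe you can edit, the mesh is the baked result.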
What text-to-CAD is not#
This part matters because the terms are getting mixed up constantly, and the confusion is not accidental. Vendors love blurry categories. Blurry categories let you claim adjacent territory without doing the actual work.
Text-to-CAD is not text-to-3D. Text-to-3D tools like Meshy, Tripo, and the various diffusion-model-based generators produce meshes for games, animation, and concept art. They are not trying to produce engineering geometry. They are not trying to produce files you can machine from. The output looks like a 3D object on screen, which is where the confusion starts, but the internal representation is completely different. Calling a mesh "CAD" is like calling a photograph of a blueprint "engineering documentation." It resembles the thing without being the thing.
Text-to-CAD is not generative design. Generative design, the kind Autodesk and Siemens have been shipping for years, starts with constraints and loads and uses topology optimization to propose organic-looking shapes. It's a different problem. Generative design asks "what shape best satisfies these forces?" Text-to-CAD asks "can you build me the thing I just described?" Generative design outputs tend to look like bones or coral. Text-to-CAD outputs tend to look like the parts a normal engineer would model. I covered the differences more thoroughly in text-to-CAD vs generative design.
Text-to-CAD is not a CAD copilot. The major vendors are all adding AI assistants to their existing tools. SolidWorks has AURA and LEO. Onshape has an AI Advisor. Siemens NX has a Design Copilot. Autodesk is working on an Assistant for Fusion. These operate inside an existing workflow, suggesting commands, answering questions, or automating repetitive tasks. They don't generate geometry from a blank prompt. They're more like a knowledgeable colleague looking over your shoulder than a tool that builds the first version of the part for you.
The differences matter because each of these approaches solves a different problem, and pretending they're all the same thing helps nobody except the people writing press releases.
How it actually works, briefly#
The technical details of how text-to-CAD works deserve their own post, but the short version is this: a text-to-CAD system takes your natural language input, interprets it as a sequence of CAD operations, and executes those operations to produce a solid model.
Some tools do this by generating code. Zoo's system, for example, uses a GPU-native geometric kernel called KittyCAD and can produce models through an API that generates geometry server-side. CADAgent, an open-source Fusion 360 add-in released in March 2026, uses an LLM to generate modeling commands that execute directly inside Fusion 360's environment. Other approaches generate OpenSCAD scripts, which are then compiled into geometry.
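To make the script-generation route concrete, here's a minimal sketch of what the OpenSCAD path might look like: a function that emits an OpenSCAD script for the L-bracket from my lunch-break prompt. In a real system the LLM would be the thing producing a script like this; the function, its parameters, and the hole placement here are all my own illustrative choices, not any vendor's pipeline.

```python
def l_bracket_scad(leg_mm: float = 40.0, thickness_mm: float = 3.0,
                   width_mm: float = 20.0, hole_d_mm: float = 4.2) -> str:
    """Emit an OpenSCAD script for a simple L-bracket with two holes."""
    return f"""
difference() {{
    union() {{
        cube([{leg_mm}, {width_mm}, {thickness_mm}]);  // horizontal leg
        cube([{thickness_mm}, {width_mm}, {leg_mm}]);  // vertical leg
    }}
    // two mounting holes through the horizontal leg
    translate([{leg_mm * 0.5}, {width_mm / 2}, -1])
        cylinder(h={thickness_mm + 2}, d={hole_d_mm}, $fn=32);
    translate([{leg_mm * 0.75}, {width_mm / 2}, -1])
        cylinder(h={thickness_mm + 2}, d={hole_d_mm}, $fn=32);
}}
""".strip()

script = l_bracket_scad()
print(script.splitlines()[0])  # prints: difference() {
```

The point of the exercise: the output is source code, so every dimension stays a number you can change and recompile, which is exactly the property a mesh generator can't give you.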
The academic foundation comes from work like the Text2CAD paper that got a spotlight at NeurIPS 2024. That research introduced an end-to-end framework using a transformer-based network to generate parametric CAD sequences from text, trained on roughly 660,000 annotations mapped to about 170,000 models from the DeepCAD dataset. The commercial tools build on these ideas, though most keep their exact architectures fairly quiet.
What all of these approaches share is the goal of producing geometry as a sequence of operations, not as a prediction of what a surface should look like. That operation-based approach is what makes the output editable. A chamfer generated as a CAD operation is a chamfer you can modify. A chamfer approximated by a mesh is just geometry that happens to look chamfered until you zoom in.
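The operation-sequence idea can be sketched as a tiny replayable interpreter — again purely illustrative, with made-up operation names rather than any real kernel's vocabulary. Editing means changing one step and replaying the recipe, not patching triangles:

```python
# Toy "CAD sequence" interpreter: geometry as a replayable list of
# operations rather than a frozen surface.
def replay(ops):
    state = {"volume_mm3": 0.0, "chamfers": []}
    sketch = None
    for op in ops:
        if op["op"] == "sketch_rect":
            sketch = (op["w_mm"], op["h_mm"])
        elif op["op"] == "extrude":
            w, h = sketch
            state["volume_mm3"] = w * h * op["depth_mm"]
        elif op["op"] == "chamfer":
            state["chamfers"].append((op["edge"], op["size_mm"]))
    return state

ops = [
    {"op": "sketch_rect", "w_mm": 40.0, "h_mm": 20.0},
    {"op": "extrude", "depth_mm": 3.0},
    {"op": "chamfer", "edge": "top", "size_mm": 1.0},
]
print(replay(ops)["volume_mm3"])  # → 2400.0

# Editing the chamfer is a one-line change to the recipe, then replay.
ops[2]["size_mm"] = 0.5
print(replay(ops)["chamfers"])    # → [('top', 0.5)]
```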
Where the technology actually stands#
I'll be honest about this part because I think the demos are running ahead of the reality, and that's a familiar pattern in CAD software.
As of early 2026, text-to-CAD works reasonably well for simple to moderately complex prismatic parts. Brackets, enclosures, simple housings, plates with hole patterns, standoffs, basic mechanical components. If you can describe it in a sentence or two and it doesn't involve complex surfacing, organic shapes, or multi-body assemblies, you have a decent shot at getting something useful back. The output usually needs cleanup. Dimensions might be close but not what you specified. Features might be placed approximately. The topology of the model might be suboptimal for later editing. But the starting point is often better than nothing, especially for prototyping or quick-iteration work.
For anything more complex, the technology gets shaky fast. Ask for a gear with specific module and tooth count, a snap-fit enclosure with proper draft angles, or a sheet metal part with bend reliefs, and you'll start seeing the limits. Multi-part assemblies are mostly out of reach. Tolerances are not handled meaningfully. And the tools have no sense of manufacturing process, so they'll happily generate geometry that looks plausible but would make a machinist need to sit down.
The dedicated tools like Zoo.dev, AdamCAD, and CADAgent are the most honest about what they can do. The major CAD vendors are adding AI features more cautiously, with Autodesk's Neural CAD and Dassault's AURA still largely in development or early rollout. PTC's Onshape AI Advisor is live but focuses on workflow assistance rather than geometry generation from scratch. For a rundown of what's available and how they compare, see best text-to-CAD tools.
Who this is actually useful for, right now#
If you're a professional engineer working on production parts with real tolerances, text-to-CAD is not replacing your workflow. Not yet. The output isn't precise enough, the feature trees aren't clean enough, and the lack of manufacturing awareness means you'd spend as much time fixing the output as you'd save.
Where it's genuinely useful today is in the early stages of design. Concept exploration, quick prototyping, generating starting geometry that you'll refine by hand, or producing simple parts for non-critical applications like 3D-printed fixtures and jigs. It's also useful for people who know what they want but don't have deep CAD skills. A hardware startup founder who needs a rough enclosure model to discuss with a contract engineer. A maker who wants a mounting bracket without learning Fusion 360 from scratch.
I think of it like a first draft. Nobody publishes a first draft, but a first draft is better than a blank page. Text-to-CAD gives you a first draft of geometry. What you do with it still depends on knowing what good geometry looks like.
Where this is going#
Text-to-CAD will get better. The training data is improving, the integration with existing CAD tools is getting tighter, and CADAgent's direct Fusion 360 plugin shows the direction of travel. Within a few years, I expect simple parts to be fairly reliable and moderate-complexity parts to work with human review.
But I don't think it replaces knowing how to use CAD any more than autocomplete replaces knowing how to write. The tool generates geometry. Understanding whether that geometry can be manufactured, whether the feature tree will survive the next revision, whether the tolerances make sense, that's still on you. And honestly, that's always been the hard part.
The bracket I generated on my lunch break? I used it as a starting point. Changed the hole positions, added a gusset the AI hadn't thought of, and exported a STEP file to a machinist who didn't know or care where the first version came from. Not magic. Not useless. Somewhere in between, which is where most useful tools live before the marketing catches up to them.