
Text-to-CAD API with Python: a developer walkthrough

A practical walkthrough of calling text-to-CAD APIs from Python, including the Zoo/KittyCAD SDK, handling responses, and saving STEP files.

Quick answer

To use text-to-CAD from Python: install the kittycad package, authenticate with an API token, call the text-to-CAD endpoint with your prompt, poll for completion, and save the resulting STEP file. The SDK handles auth, polling, and file format conversion.

Last week I had a conversation with a coworker who's mostly a Python person, embedded systems and data pipelines, not much CAD. He'd seen me use Zoo's text-to-CAD through the browser and wanted to know if he could call it from a script. "I just want to POST a description and get a file back," he said, like it was the most reasonable thing in the world. And it is, in principle. In practice, there are about six decisions between pip install and a working STEP file on disk, and most of them aren't obvious from the documentation.

This is the walkthrough I wrote for him, cleaned up for people who are comfortable with Python but haven't necessarily spent time in CAD tooling. If you already know your way around the text-to-CAD API, this post is about the Python-specific details: the SDK, the code patterns, the places where things break, and the workarounds I've settled on.

Setup

Install the SDK:

pip install kittycad

The package is called kittycad for historical reasons (Zoo.dev used to be KittyCAD, and the Python package name stuck). The current version at the time of writing is 1.3.5. It pulls in httpx and pydantic as dependencies, which is standard for a modern Python API client.

You need an API token from zoo.dev. Sign up, go to account settings, generate a token. Set it as an environment variable:

export ZOO_API_TOKEN=your-token-here

The SDK reads this automatically when you initialize the client. You can also pass the token directly, but the environment variable approach is cleaner for scripts that might end up in version control. Nobody needs to see your token in a git diff at 2 AM.

The basic workflow

The pattern is: create client, submit prompt, poll for status, save the file. Here's the minimal version that actually works:

import time
import base64
from kittycad.client import ClientFromEnv
from kittycad.api.ml import create_text_to_cad, get_text_to_cad_part_for_user
from kittycad.models import TextToCadCreateBody, ApiCallStatus

client = ClientFromEnv()

result = create_text_to_cad.sync(
    client=client,
    output_format="step",
    body=TextToCadCreateBody(
        prompt="L-bracket, 3mm thick, 40mm legs, two 5mm holes per leg"
    ),
)

request_id = result.id

while True:
    response = get_text_to_cad_part_for_user.sync(
        client=client,
        id=request_id,
    )
    if response.status in (ApiCallStatus.COMPLETED, ApiCallStatus.FAILED):
        break
    time.sleep(5)

if response.status == ApiCallStatus.COMPLETED:
    for name, content in response.outputs.items():
        if name.endswith(".step"):
            with open("bracket.step", "w") as f:
                f.write(base64.b64decode(content).decode("utf-8"))
            print("Saved bracket.step")
else:
    print(f"Generation failed: {response.error}")

That's about 25 lines of actual code, and most of it is the polling loop. Let me walk through the parts that matter and the parts that trip people up.

Client initialization

from kittycad.client import ClientFromEnv

client = ClientFromEnv()

ClientFromEnv() looks for ZOO_API_TOKEN in your environment. If it's not there, you get an error at request time, not at initialization, which is mildly annoying because you won't know your token is missing until you've already set up everything else. A quick sanity check after creating the client saves debugging time.
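One way to get that sanity check is to verify the token exists before you build the client at all. This is a sketch using only the standard library; the variable name is the one the SDK reads, but the helper itself is mine, not part of the SDK:

```python
import os

def require_token(var="ZOO_API_TOKEN"):
    """Fail fast with a clear message if the API token is missing."""
    token = os.environ.get(var)
    if not token:
        raise RuntimeError(f"{var} is not set; export it before running this script")
    return token
```

I call this once at the top of a script, before ClientFromEnv(), so a missing token fails on line one instead of after the first request.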

You can also create a client with an explicit token:

from kittycad.client import ClientFromToken

client = ClientFromToken(token="your-token-here")

I use ClientFromEnv for scripts and ClientFromToken for quick interactive testing in a notebook, where I'll paste in a token that I don't want hardcoded anywhere.

Submitting the prompt

result = create_text_to_cad.sync(
    client=client,
    output_format="step",
    body=TextToCadCreateBody(
        prompt="L-bracket, 3mm thick, 40mm legs, two 5mm holes per leg"
    ),
)

The output_format parameter goes in the URL path (/ai/text-to-cad/step). The prompt goes in the JSON body. The API always returns STEP and glTF by default regardless of what you specify, but the output_format parameter tells it which additional format to include if you want something else.
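If you're curious what the SDK is doing under the hood, the submission is a single POST. This is a hedged sketch using only the standard library; the path segment matches what's described above, but the base URL is my assumption, so check the API docs before relying on it:

```python
import json
import urllib.request

BASE_URL = "https://api.zoo.dev"  # assumed base URL; verify against Zoo's docs

def text_to_cad_url(output_format):
    # The output format is a path segment, not a query parameter
    return f"{BASE_URL}/ai/text-to-cad/{output_format}"

def submit_prompt(token, prompt, output_format="step"):
    """Roughly what create_text_to_cad.sync does: POST the prompt as JSON."""
    req = urllib.request.Request(
        text_to_cad_url(output_format),
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

In practice I never use this; the SDK handles retries on auth and serialization for you. But seeing the raw request makes the output_format-in-the-path detail click.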

The .sync() method blocks until the HTTP request completes (not until the generation completes). You get back an object with an id (UUID), a status (usually queued at this point), and the prompt echoed back. The actual geometry generation happens server-side in the background.

There's also an .asyncio() variant if you're using async/await:

result = await create_text_to_cad.asyncio(
    client=client,
    output_format="step",
    body=TextToCadCreateBody(
        prompt="L-bracket, 3mm thick, 40mm legs, two 5mm holes per leg"
    ),
)

I've used the async version when generating multiple parts concurrently. It's the same API underneath, just wrapped in httpx.AsyncClient instead of the sync one.

The polling loop

This is the least elegant part of the whole workflow, and there's no getting around it. The API is asynchronous. You submit a request, get a job ID, and then keep asking "is it done yet?" until it is. The SDK doesn't have a built-in wait-for-completion helper, which seems like an oversight but probably reflects the fact that different use cases want different polling strategies.

while True:
    response = get_text_to_cad_part_for_user.sync(
        client=client,
        id=request_id,
    )
    if response.status in (ApiCallStatus.COMPLETED, ApiCallStatus.FAILED):
        break
    time.sleep(5)

Five seconds between polls is fine for most cases. Generation typically takes 15 to 90 seconds. If you're generating something simple, you'll poll three or four times; if it's complex, maybe fifteen times. I've experimented with adaptive polling (shorter intervals for the first 30 seconds, longer intervals after that), but the improvement is marginal and the code is uglier.
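Since the SDK leaves polling to you, it's worth keeping the policy in one generic helper rather than copy-pasting the loop. This is my sketch, not SDK code; fetch and is_done are whatever you plug in:

```python
import time

def wait_for(fetch, is_done, interval=5.0, timeout=300.0):
    """Poll fetch() until is_done(result) is true or timeout expires.

    fetch: zero-argument callable returning the latest job state
    is_done: predicate on that state
    Returns the final state, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("polling timed out")
        time.sleep(interval)
```

For the job above, fetch would be a lambda wrapping get_text_to_cad_part_for_user.sync and is_done would check for COMPLETED or FAILED. The timeout also fixes a quiet flaw in the bare while True loop: a job that never terminates no longer hangs your script forever.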

The status values you'll see: queued, uploaded, in_progress, completed, failed. In practice, most requests go queued, then in_progress, then completed or failed. The uploaded state is transient and I've only seen it in logs, never caught it in a polling loop.

Handling the output

When the generation succeeds, response.outputs is a dictionary where keys are filenames (like output.step, output.gltf) and values are base64-encoded file content. This is the part where people get confused, because you need to decode the base64 before saving:

for name, content in response.outputs.items():
    if name.endswith(".step"):
        step_data = base64.b64decode(content).decode("utf-8")
        with open("my_part.step", "w") as f:
            f.write(step_data)

STEP files are plain text, so decoding to UTF-8 works. For binary formats like STL or GLB, you'd write bytes instead:

for name, content in response.outputs.items():
    if name.endswith(".stl"):
        stl_data = base64.b64decode(content)
        with open("my_part.stl", "wb") as f:
            f.write(stl_data)

The filenames in the output dictionary aren't always predictable. I've seen output.step, output.gltf, and variations. Matching on the file extension is safer than matching on the exact filename.
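Putting the extension matching and the text-versus-binary split together, here's the save helper I use, as a sketch. Which extensions count as plain text is my assumption; writing everything as bytes also works if you don't care about the distinction:

```python
import base64
from pathlib import Path

# Formats I treat as plain text; anything else (.stl, .glb) is written as bytes
TEXT_EXTENSIONS = {".step", ".stp", ".gltf"}

def save_outputs(outputs, directory="."):
    """Decode a {filename: base64_content} dict and write each file to disk."""
    saved = []
    for name, content in outputs.items():
        raw = base64.b64decode(content)
        path = Path(directory) / name
        if path.suffix.lower() in TEXT_EXTENSIONS:
            path.write_text(raw.decode("utf-8"))
        else:
            path.write_bytes(raw)
        saved.append(str(path))
    return saved
```

Because it keys on the suffix rather than the exact name, it survives the output.step versus bracket.step naming variations.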

Error handling that actually matters

About 10 to 15 percent of my requests fail, based on a few hundred generations over the past few months. The failure modes break down like this:

The prompt is too vague. You get a 400 error or a failed status with a message like "The prompt must clearly describe a CAD model." This is the most common failure and the easiest to fix. Be more specific. Include dimensions. Describe the shape in terms of features (holes, fillets, extrusions) rather than functions ("something to hold a sensor").

The geometry is too complex. The model tries and fails. You get a failed status, sometimes with a useful error, sometimes with a generic failure message. Simplify the prompt or break the part into simpler components.

Transient failures. Server-side issues, timeouts, bad luck. These are rare but real. A single retry with a short delay usually works.

Here's a pattern I use for production scripts:

import time
import base64
from kittycad.client import ClientFromEnv
from kittycad.api.ml import create_text_to_cad, get_text_to_cad_part_for_user
from kittycad.models import TextToCadCreateBody, ApiCallStatus

client = ClientFromEnv()

def generate_step(prompt, output_path, max_retries=2):
    for attempt in range(max_retries):
        try:
            result = create_text_to_cad.sync(
                client=client,
                output_format="step",
                body=TextToCadCreateBody(prompt=prompt),
            )
        except Exception as e:
            print(f"  Request failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(10)
                continue
            return False

        for _ in range(60):
            response = get_text_to_cad_part_for_user.sync(
                client=client,
                id=result.id,
            )
            if response.status in (
                ApiCallStatus.COMPLETED,
                ApiCallStatus.FAILED,
            ):
                break
            time.sleep(5)

        if response.status == ApiCallStatus.COMPLETED:
            for name, content in response.outputs.items():
                if name.endswith(".step"):
                    step_data = base64.b64decode(content).decode("utf-8")
                    with open(output_path, "w") as f:
                        f.write(step_data)
                    return True

        print(f"  Attempt {attempt + 1} failed: {response.error}")
        time.sleep(5)

    return False

That's the function I call from batch scripts. It retries once, handles both HTTP errors and generation failures, and returns a boolean so the calling code knows whether to continue or log the failure. Nothing fancy, but it catches the cases that a naive single-attempt script misses.

Batch generation

Generating multiple parts from a list is the workflow that made me write this whole thing. Here's the shape of it:

import csv

parts = []
with open("parts.csv") as f:
    reader = csv.DictReader(f)
    for row in reader:
        parts.append(row)

for part in parts:
    prompt = (
        f"{part['type']}, {part['length']}mm by {part['width']}mm, "
        f"{part['thickness']}mm thick, {part['holes']} holes "
        f"of {part['hole_diameter']}mm diameter"
    )
    output_file = f"output/{part['name']}.step"
    print(f"Generating {part['name']}...")

    success = generate_step(prompt, output_file)
    if success:
        print(f"  Saved {output_file}")
    else:
        print(f"  FAILED: {part['name']}")

A few things I learned running this kind of batch:

Space your requests. Hitting the API 30 times in rapid succession gets you throttled. A natural delay from the polling loop usually provides enough spacing, but if you're using the async variant to fire requests concurrently, limit concurrency to maybe 3 to 5 at a time.

Log everything. Prompt, request ID, status, error message, filename. When request 23 out of 50 fails and you're trying to figure out why, you want the receipt.

Save the prompt alongside the STEP file. I write a small JSON sidecar for each generated file with the prompt, the request ID, and the timestamp. When I open a STEP file three weeks later and can't remember what I asked for, the sidecar tells me.
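The sidecar itself is only a few lines. The field names here are my own convention, not anything the API defines:

```python
import json
import time
from pathlib import Path

def write_sidecar(step_path, prompt, request_id):
    """Record provenance in a .json file next to the generated STEP file."""
    sidecar = Path(step_path).with_suffix(".json")
    sidecar.write_text(json.dumps({
        "prompt": prompt,
        "request_id": request_id,
        "generated_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }, indent=2))
    return sidecar
```

I call it right after the STEP file is written, so the two files always land together.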

The async approach for concurrent generation

If you want to generate several parts at once instead of one at a time:

import asyncio
import base64
from kittycad.client import ClientFromEnv
from kittycad.api.ml import create_text_to_cad, get_text_to_cad_part_for_user
from kittycad.models import TextToCadCreateBody, ApiCallStatus

client = ClientFromEnv()

async def generate_async(prompt, output_path):
    result = await create_text_to_cad.asyncio(
        client=client,
        output_format="step",
        body=TextToCadCreateBody(prompt=prompt),
    )
    while True:
        response = await get_text_to_cad_part_for_user.asyncio(
            client=client,
            id=result.id,
        )
        if response.status in (ApiCallStatus.COMPLETED, ApiCallStatus.FAILED):
            break
        await asyncio.sleep(5)

    if response.status == ApiCallStatus.COMPLETED:
        for name, content in response.outputs.items():
            if name.endswith(".step"):
                with open(output_path, "w") as f:
                    f.write(base64.b64decode(content).decode("utf-8"))
                return True
    return False

async def main():
    prompts = [
        ("Flat plate 80x50x3mm with four M4 holes", "plate.step"),
        ("Cylindrical standoff OD 20mm ID 10mm height 15mm", "standoff.step"),
        ("U-bracket 50mm wide 30mm tall 3mm thick", "bracket.step"),
    ]

    semaphore = asyncio.Semaphore(3)

    async def limited(prompt, path):
        async with semaphore:
            return await generate_async(prompt, path)

    results = await asyncio.gather(
        *[limited(p, path) for p, path in prompts]
    )
    print(f"Generated {sum(results)} of {len(prompts)} parts")

asyncio.run(main())

The semaphore limits concurrency to three simultaneous requests. You could go higher, but I haven't tested what the API tolerates before it starts returning errors. Three works reliably and still gives you a meaningful speedup over sequential generation.

DIY alternative: LLM + OpenSCAD

If you don't want to depend on Zoo's API, you can build a text-to-CAD pipeline entirely in Python using a language model and OpenSCAD. The idea is: send your part description to GPT-4 or Claude, ask it to generate an OpenSCAD script, run openscad as a subprocess to render the geometry, and save the output.

import subprocess
import openai

def text_to_scad(prompt):
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Generate OpenSCAD code for the following part. Output only valid OpenSCAD code, no explanation."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

def render_stl(scad_code, output_path):
    scad_file = output_path.replace(".stl", ".scad")
    with open(scad_file, "w") as f:
        f.write(scad_code)
    subprocess.run(
        ["openscad", "-o", output_path, scad_file],
        check=True,
        capture_output=True,
    )

This works for simple parts and gives you full control over the LLM, the prompt engineering, and the output pipeline. The downsides are real though: OpenSCAD outputs STL, not STEP. The geometry is CSG, not B-Rep. The LLM generates broken scripts more often than the Zoo API generates broken geometry. And you're paying for LLM API calls plus maintaining the pipeline yourself.
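One cheap mitigation for the broken-script problem: LLMs often wrap their output in markdown code fences even when told not to, and openscad chokes on the backticks. Stripping them before rendering removes a whole class of parse failures. A small sketch:

```python
FENCE = "`" * 3  # the markdown code-fence delimiter

def strip_code_fences(text):
    """Remove the fence wrapper LLMs often add around generated code."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].startswith(FENCE):
        lines = lines[:-1]
    return "\n".join(lines)
```

I run this on the output of text_to_scad before handing it to render_stl. It doesn't fix genuinely broken OpenSCAD, but it stops well-formed scripts from failing over formatting.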

For serious work, I use Zoo's API. For experiments and one-off hacks, the LLM-to-OpenSCAD pipeline is fun and surprisingly capable within its limits. The text-to-CAD API overview covers how these approaches compare.

What I actually use this for

My current setup is a small Python script that reads part descriptions from a YAML file, generates STEP files via Zoo's API, and drops them in a folder that syncs to my Fusion 360 projects. It runs as a cron job twice a day, picking up any new entries I've added to the YAML file. The whole thing is about 80 lines of Python, and it saves me maybe 15 to 20 minutes per part of manual Fusion 360 modeling for the simple brackets and plates that make up most of my fixture work.

It's not magic. Every generated STEP file still gets opened, measured, and usually edited before I use it. But starting from a generated solid instead of a blank sketch is consistently faster for the kinds of parts where text-to-CAD does well. And the fact that I can define those parts in a text file and generate them programmatically means the whole workflow lives in version control, which makes the project manager in me unreasonably happy.

For the full Zoo-specific tutorial with curl examples and step-by-step progression from first request to production script, see the Zoo text-to-CAD API tutorial. For more on the KittyCAD Python SDK, Zoo's Python documentation is decent and improving.

The text-to-CAD guide covers the broader tool landscape if you're still deciding which approach fits your workflow.
