CADFS treats generative CAD as direct generation of FeatureScript, the native language of the Onshape platform.
This choice is central to the method: rather than reducing models to simplified sketch-extrude tokens, CADFS keeps them in FeatureScript, which preserves the full design history, including higher-level operations such as revolve, sweep, loft, fillet, chamfer, shell, boolean edits, and patterns.
Because the representation remains executable and close to real engineering workflows, it is a more faithful learning target than synthetic or heavily simplified CAD encodings.
Starting from Onshape's internal model representation, we reconstruct clean and compact FeatureScript programs. We first extract the sequence of modeling operations and make implicit parameters explicit. We then standardize units and numeric precision, replace placeholder queries with meaningful references, normalize random identifiers, simplify operation definitions, and remove redundant construction steps.
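Two of these cleanup steps, identifier normalization and numeric-precision standardization, can be illustrated with a minimal text-level sketch. It is not the CADFS implementation: the function names, the assumed shape of a random identifier (a quoted token of 16 or more word characters), and the four-decimal precision are all hypothetical choices made for illustration.

```python
import re

# Assumption: a "random identifier" is a quoted token of 16+ word characters.
# The real identifier format may differ; adjust the pattern accordingly.
ID_PATTERN = re.compile(r'"([A-Za-z0-9_]{16,})"')

# Plain decimal literals such as 0.0254; scientific notation is left untouched.
NUMBER_PATTERN = re.compile(r'(?<![\w.])-?\d+\.\d+(?![\w.])')


def normalize_identifiers(code: str, prefix: str = "id") -> str:
    """Replace each distinct random identifier with a stable sequential name."""
    mapping: dict[str, str] = {}

    def rename(match: re.Match) -> str:
        raw = match.group(1)
        if raw not in mapping:
            mapping[raw] = f"{prefix}{len(mapping) + 1}"
        return f'"{mapping[raw]}"'

    return ID_PATTERN.sub(rename, code)


def round_numbers(code: str, digits: int = 4) -> str:
    """Round decimal literals to a fixed number of decimal places."""

    def shorten(match: re.Match) -> str:
        value = round(float(match.group(0)), digits) + 0.0  # + 0.0 turns -0.0 into 0.0
        return repr(value)

    return NUMBER_PATTERN.sub(shorten, code)
```

Because the same raw identifier always maps to the same new name, later references to an earlier feature (for example in a query) stay consistent after renaming.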
Each recovered program is executed and checked against the source model, so only designs whose code reproduces the original are kept in the dataset.
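The execute-and-compare filter can be sketched as follows. The text does not specify how a rebuilt model is compared with its source, so the volume and bounding-box comparison used here is an assumed stand-in criterion, and `ShapeSummary`, `matches`, `keep_verified`, and the `execute` callable are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ShapeSummary:
    """Assumed geometric fingerprint of a model (not specified by the source)."""
    volume: float
    bbox: tuple[float, float, float]  # bounding-box extents along x, y, z


def matches(a: ShapeSummary, b: ShapeSummary, rel_tol: float = 1e-4) -> bool:
    """True when volume and all bounding-box extents agree within tolerance."""
    pairs = [(a.volume, b.volume), *zip(a.bbox, b.bbox)]
    return all(math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-9) for x, y in pairs)


def keep_verified(
    candidates: Iterable[tuple[str, ShapeSummary]],
    execute: Callable[[str], ShapeSummary],  # hypothetical: runs a program, summarizes the result
    rel_tol: float = 1e-4,
) -> list[str]:
    """Keep only programs that run and reproduce their reference model."""
    kept = []
    for code, reference in candidates:
        try:
            rebuilt = execute(code)
        except Exception:
            continue  # a program that fails to execute is dropped
        if matches(rebuilt, reference, rel_tol):
            kept.append(code)
    return kept
```

Passing `execute` in as a parameter keeps the filter independent of how programs are actually run, which the sketch deliberately leaves open.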
After reconstructing the programs, we add language supervision with a two-stage annotation pipeline.
One LLM first writes a structured description of the construction process from the FeatureScript code, and a second LLM reviews that draft against the code and the documentation to correct terminology, verify the order of operations, and resolve ambiguous references to geometric entities.
This produces descriptions that align closely with the actual modeling logic and give the training data an explicit link between natural language and executable CAD procedures.
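The two-stage annotation described above can be sketched as a writer call followed by a reviewer call. This is only an outline under stated assumptions: the `LLM` callable type, the `annotate` function, and the prompt wording are hypothetical and do not reproduce the prompts or models actually used.

```python
from typing import Callable

# Hypothetical interface: any function that takes a prompt and returns text.
LLM = Callable[[str], str]


def annotate(code: str, reference_docs: str, writer: LLM, reviewer: LLM) -> str:
    """Draft a construction description, then have a second model review it."""
    draft = writer(
        "Describe, step by step, how this FeatureScript program builds the model.\n\n"
        f"CODE:\n{code}"
    )
    # The reviewer sees the draft together with the code and the documentation,
    # so it can fix terminology, operation order, and ambiguous entity references.
    return reviewer(
        "Review the description against the code and documentation. "
        "Correct terminology, check the order of operations, and resolve "
        "ambiguous references to geometric entities. Return the corrected text.\n\n"
        f"CODE:\n{code}\n\nDOCUMENTATION:\n{reference_docs}\n\nDRAFT:\n{draft}"
    )
```

Keeping the writer and reviewer as separate callables mirrors the separation of roles in the pipeline: the second model never writes from scratch and only checks the draft against the code and documentation.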
The example below shows why FeatureScript is an effective target representation for this task.
The code does not merely list primitive shapes: it records how sketches are created, how solids are produced from them, which edges or faces are referenced later, and how subsequent refinement and reuse operations modify the model.
In this example, the model is assembled through a sequence of interpretable operations: profiles are drawn with spline, arc, and text primitives; solids are created with revolve and extrude; specific edges are identified through structured queries; those entities are refined with fillets; parts are replicated with a circular pattern; and a loft operation builds a smooth support structure.
Because FeatureScript can refer to geometry through its origin, role, type, and local topology, it supports precise downstream edits and makes the design history understandable both to humans and to language models.