From Natural Language to Robot: How Kindly IDE Works Under the Hood

By Dataset System

"Make me a 4-DOF robotic arm that can pick up objects from a conveyor belt." Type that into Kindly IDE and you get a complete robot design in about 30 seconds. Here's what happens in those 30 seconds.

The Pipeline

The generation pipeline has four stages: interpretation, structure generation, physics validation, and refinement. Each LLM-backed stage can use a different backend; Kindly IDE supports OpenAI (GPT-4o), Anthropic (Claude), and Google (Gemini) out of the box. Physics validation is the exception, since it runs as deterministic code rather than as a model call.
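
One way to picture the per-stage backend choice is as a plain configuration map. This is a minimal sketch, not Kindly IDE's actual API: the `Backend`, `LlmStage`, and `PipelineConfig` names, the `withBackend` helper, and the default assignments are all assumptions made for illustration.

```typescript
// Hypothetical names throughout; none of these types come from Kindly IDE itself.
type Backend = "openai" | "anthropic" | "google";
type LlmStage = "interpretation" | "structureGeneration" | "refinement";

// One backend per LLM-backed stage. Physics validation is left out because
// the article describes it as deterministic code, not an LLM call.
type PipelineConfig = Record<LlmStage, Backend>;

// Arbitrary defaults, chosen only so the example has something to show.
const defaultConfig: PipelineConfig = {
  interpretation: "openai",
  structureGeneration: "anthropic",
  refinement: "google",
};

// Return a copy of the config with one stage switched to another backend.
function withBackend(
  config: PipelineConfig,
  stage: LlmStage,
  backend: Backend
): PipelineConfig {
  const next: PipelineConfig = { ...config };
  next[stage] = backend;
  return next;
}

// Usage: move interpretation to a different backend without touching the rest.
const custom = withBackend(defaultConfig, "interpretation", "google");
```

Returning a copy rather than mutating the config keeps the previous setup available if a switch needs to be undone.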

Stage 1: Interpretation

Your natural language prompt is sent to the LLM along with a system prompt that defines what a valid robot structure looks like. The model extracts the number and type of joints, kinematic topology, dimensions, end-effector type, and mounting configuration.

Stage 2: Structure Generation

The interpreted requirements are fed into a second LLM call, with structured output enforced via a Zod schema. The model must return a JSON object that matches the RobotStructure schema exactly: no free-form text, no hallucinated fields.

{
  name: string,
  links: [{ name, visual: { geometry }, collision: { geometry }, inertial: { mass, inertia } }],
  joints: [{ name, type, parent, child, origin, axis, limits }]
}

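Structured output means the response is guaranteed to parse as JSON that matches the schema. No regex extraction, no "please format your response as JSON" prayers. Schema conformance says nothing about physical plausibility, though, which is the job of the next stage.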
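
To make "no hallucinated fields" concrete, here is a dependency-free sketch of a strict parse. It is not the Zod schema Kindly IDE uses, which the article does not show. The `RobotStructureLite` type, the string field types, and the `parseRobotStructure` helper are assumptions, and only a subset of the fields is checked. In Zod itself, a strict object schema plays a similar role.

```typescript
// Assumed field types; the article lists only the field names.
interface Joint {
  name: string;
  type: string;
  parent: string;
  child: string;
}

// A reduced version of the RobotStructure shape, for illustration only.
interface RobotStructureLite {
  name: string;
  links: { name: string }[];
  joints: Joint[];
}

const ALLOWED_KEYS = ["name", "links", "joints"];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function requireString(obj: Record<string, unknown>, key: string): string {
  const value = obj[key];
  if (typeof value !== "string") throw new Error(key + " must be a string");
  return value;
}

// Returns the parsed structure, or throws on a missing, mistyped,
// or unexpected top-level field.
function parseRobotStructure(json: string): RobotStructureLite {
  const data: unknown = JSON.parse(json);
  if (!isRecord(data)) throw new Error("expected a JSON object");
  for (const key of Object.keys(data)) {
    if (ALLOWED_KEYS.indexOf(key) === -1) {
      throw new Error("unexpected field: " + key);
    }
  }
  const { links, joints } = data;
  if (!Array.isArray(links) || !Array.isArray(joints)) {
    throw new Error("links and joints must be arrays");
  }
  return {
    name: requireString(data, "name"),
    links: links.map((l: unknown) => {
      if (!isRecord(l)) throw new Error("link must be an object");
      return { name: requireString(l, "name") };
    }),
    joints: joints.map((j: unknown) => {
      if (!isRecord(j)) throw new Error("joint must be an object");
      return {
        name: requireString(j, "name"),
        type: requireString(j, "type"),
        parent: requireString(j, "parent"),
        child: requireString(j, "child"),
      };
    }),
  };
}
```

A response carrying an extra top-level key such as `notes` is rejected outright instead of being silently accepted.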

Stage 3: Physics Validation

The generated structure passes through the physics linter, which is deterministic code, not another LLM call. It checks for positive-definite inertia tensors, the triangle inequality on the moments of inertia, mass consistency, joint limit validity, kinematic tree validity, and self-collision at default and limit poses.
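
Three of these checks fit in a few lines each. The sketch below shows one plausible way to write them and should not be read as the linter's actual code: the `Inertia` and `JointRef` types and all function names are assumptions. Positive definiteness is tested with Sylvester's criterion, and the triangle inequality is applied to the diagonal entries, which any real mass distribution must satisfy in any frame.

```typescript
// Symmetric 3x3 inertia tensor with URDF-style component names (an assumption).
interface Inertia {
  ixx: number; ixy: number; ixz: number;
  iyy: number; iyz: number;
  izz: number;
}

// Sylvester's criterion: a symmetric matrix is positive definite
// exactly when all of its leading principal minors are positive.
function isPositiveDefinite(i: Inertia): boolean {
  const minor1 = i.ixx;
  const minor2 = i.ixx * i.iyy - i.ixy * i.ixy;
  const det =
    i.ixx * (i.iyy * i.izz - i.iyz * i.iyz) -
    i.ixy * (i.ixy * i.izz - i.iyz * i.ixz) +
    i.ixz * (i.ixy * i.iyz - i.iyy * i.ixz);
  return minor1 > 0 && minor2 > 0 && det > 0;
}

// No moment of inertia may exceed the sum of the other two.
// eps absorbs floating-point rounding.
function satisfiesTriangleInequality(i: Inertia, eps = 1e-12): boolean {
  return (
    i.ixx + i.iyy + eps >= i.izz &&
    i.iyy + i.izz + eps >= i.ixx &&
    i.izz + i.ixx + eps >= i.iyy
  );
}

interface JointRef { parent: string; child: string }

// A valid kinematic tree has exactly one root link, gives every other link
// exactly one parent joint, and reaches every link from the root.
function isKinematicTree(links: string[], joints: JointRef[]): boolean {
  const parentCount: Record<string, number> = {};
  for (const l of links) parentCount[l] = 0;
  for (const j of joints) {
    const known = links.indexOf(j.parent) !== -1 && links.indexOf(j.child) !== -1;
    if (!known) return false; // joint refers to a link that does not exist
    parentCount[j.child] += 1;
  }
  const roots = links.filter((l) => parentCount[l] === 0);
  if (roots.length !== 1 || links.some((l) => parentCount[l] > 1)) return false;
  // Walk down from the root; a cycle or detached group leaves links unvisited.
  const seen: Record<string, boolean> = {};
  const stack: string[] = [roots[0]];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    seen[current] = true;
    for (const j of joints) {
      if (j.parent === current && !seen[j.child]) stack.push(j.child);
    }
  }
  return links.every((l) => seen[l] === true);
}
```

A tensor such as diag(1, 1, 3) is positive definite yet fails the triangle inequality, which is why the two checks are listed separately.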

Issues are auto-corrected where possible (e.g., swapping reversed joint limits) and flagged as warnings where human judgment is needed.
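
The split between auto-correction and warnings might look like the sketch below. It assumes a hypothetical `lintJointLimits` helper and `LintResult` shape. Only the reversed-limits swap comes from the article; the other two branches are illustrative guesses at what "needs human judgment" could mean.

```typescript
interface JointLimits { lower: number; upper: number }

// Hypothetical result shape: corrected limits plus a log of what happened.
interface LintResult {
  limits: JointLimits;
  fixes: string[];    // changes applied automatically
  warnings: string[]; // issues left for a human to decide
}

function lintJointLimits(jointName: string, limits: JointLimits): LintResult {
  const fixes: string[] = [];
  const warnings: string[] = [];
  let { lower, upper } = limits;

  if (!isFinite(lower) || !isFinite(upper)) {
    // No safe automatic fix for NaN or infinite limits (assumed policy).
    warnings.push(`${jointName}: non-finite limit needs manual review`);
  } else if (lower > upper) {
    // Reversed limits have only one sensible reading, so swap them.
    [lower, upper] = [upper, lower];
    fixes.push(`${jointName}: swapped reversed limits`);
  } else if (lower === upper) {
    // Could be intentional or a mistake; flag it rather than guess (assumed policy).
    warnings.push(`${jointName}: zero range of motion`);
  }

  return { limits: { lower, upper }, fixes, warnings };
}

// Usage: reversed limits come back corrected, with the fix recorded.
const elbow = lintJointLimits("elbow", { lower: 1.5, upper: -1.5 });
```

Recording fixes separately from warnings lets the IDE apply the former silently while surfacing only the latter to the user.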

Stage 4: Iterative Refinement

After the initial generation, you can refine the robot through conversation, with requests such as "Make the base link heavier for stability" or "Add a camera mount to the end effector." Each refinement goes through the same pipeline, editing the existing valid structure rather than starting from scratch.
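
Editing an existing structure, rather than regenerating it, can be pictured as applying a small edit and keeping the result only if it still validates. The sketch below is an assumption about how that might work: the `Edit` vocabulary, the pared-down `Robot` type, and the `isValid` stand-in for the physics linter are all hypothetical.

```typescript
// Hypothetical, pared-down types; the real RobotStructure has more fields.
interface Link { name: string; mass: number }
interface Robot { name: string; links: Link[] }

// Assumed edit vocabulary that a refinement request might be translated into.
type Edit =
  | { op: "scaleMass"; target: string; factor: number }
  | { op: "addLink"; newLink: Link };

// Stand-in for the physics linter: here, only "every mass is positive".
function isValid(robot: Robot): boolean {
  return robot.links.every((l) => l.mass > 0);
}

// Apply one edit to a copy of the robot, leaving the input untouched.
function applyEdit(robot: Robot, edit: Edit): Robot {
  if (edit.op === "scaleMass") {
    const { target, factor } = edit;
    if (!robot.links.some((l) => l.name === target)) {
      throw new Error("unknown link: " + target);
    }
    const links = robot.links.map((l) =>
      l.name === target ? { name: l.name, mass: l.mass * factor } : l
    );
    return { name: robot.name, links };
  }
  return { name: robot.name, links: robot.links.concat([edit.newLink]) };
}

// Keep the edited robot only if it still validates; otherwise keep the previous one.
function refine(robot: Robot, edit: Edit): Robot {
  const candidate = applyEdit(robot, edit);
  return isValid(candidate) ? candidate : robot;
}

// Usage, mirroring the two example requests above.
const arm: Robot = { name: "arm", links: [{ name: "base", mass: 2 }] };
const heavier = refine(arm, { op: "scaleMass", target: "base", factor: 1.5 });
const withCamera = refine(heavier, {
  op: "addLink",
  newLink: { name: "camera_mount", mass: 0.1 },
});
```

Because each step starts from a structure that already passed validation, a bad edit costs one rejected candidate instead of a full regeneration.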

Why Multi-Model Support Matters

Different LLMs have different strengths. GPT-4o produces the most creative configurations, Claude generates the most physically realistic inertial properties, and Gemini is fastest for simple robots. You can switch models mid-session: because every backend must return the same structured output schema, a robot started with one model can be refined with another.

See It in Action

Try the full generation pipeline in your browser or download the desktop IDE.

© 2026 Kindly Robotics