1 · Introduction
Building a custom drone has, historically, been an exercise in cross-referencing: motor datasheets against ESC current ratings, prop diameters against frame wheelbases, flight-controller UART budgets against desired peripherals. Experienced builders do this in their heads. First-time builders spend, by our internal surveys, a median of 22 hours per build across YouTube videos, forum threads, and retailer spreadsheets — and approximately 40% still order at least one incompatible component [1].
DRONA proposes an AI-native alternative: the user describes the design they want in natural language, and a conversational agent returns a complete configuration with 3D visualization, a priced bill of materials, and an exportable firmware stub. This is attractive precisely to the extent that the agent's recommendations are actually correct. Language models, left unconstrained, hallucinate part numbers, invent KV ratings, and confidently specify stack-mount patterns that do not physically exist. The remedy we adopt throughout the product is to ground every recommendation in two deterministic, external systems:
- A specs engine that computes thrust, flight time, current draw, noise, maximum speed, and cost from a typed DroneConfig object, using physics-derived expressions with empirically calibrated correction factors.
- A compatibility engine that evaluates twelve universal predicates against the same DroneConfig, producing a severity-tagged report. An error blocks export and is returned to the language model as a constraint violation; the model must then propose an adjustment.
Both systems are deterministic, independently testable, and share a single source of truth with the server's tool-use endpoints and the client's React UI via an isomorphic ECMAScript module (§6). The language model is therefore not the authority on physics; it is the natural-language interface to a physics authority that sits outside the model. This paper describes both systems, their derivation, their calibration, and the evaluation regime we use to guard against drift.
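The propose, validate, revise loop described above can be sketched as follows. This is a minimal illustration, not DRONA's implementation: `refineUntilValid`, `Violation`, the `revise` callback (a synchronous stand-in for the language-model call), and the round budget are all hypothetical names and choices introduced here.

```typescript
// A constraint violation as returned to the model (hypothetical shape).
interface Violation {
  ruleId: string;
  message: string;
}

interface RefineOutcome<C> {
  config: C;
  violations: Violation[];
  rounds: number;
}

// Repeatedly validate a proposal and ask for a revision until no
// blocking violations remain or the round budget is exhausted.
// `revise` stands in for the language-model call; `validate` stands in
// for the compatibility engine.
function refineUntilValid<C>(
  initial: C,
  validate: (config: C) => Violation[],
  revise: (config: C, violations: Violation[]) => C,
  maxRounds: number = 5,
): RefineOutcome<C> {
  let config = initial;
  let violations = validate(config);
  let rounds = 0;
  while (violations.length > 0 && rounds < maxRounds) {
    config = revise(config, violations);
    violations = validate(config);
    rounds += 1;
  }
  return { config, violations, rounds };
}

// Toy example: the "model" raises the ESC rating by 10 A per round
// until it covers a 40 A motor peak.
interface ToyConfig {
  escAmps: number;
}
const toyOutcome = refineUntilValid<ToyConfig>(
  { escAmps: 20 },
  (c) => (c.escAmps >= 40 ? [] : [{ ruleId: "esc-current", message: "ESC under-rated" }]),
  (c) => ({ escAmps: c.escAmps + 10 }),
);
// toyOutcome.config.escAmps is 40 after 2 rounds.
```

The round budget is a design choice of this sketch: it guarantees termination even if the reviser never satisfies the validator, in which case the remaining violations are returned to the caller.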
2 · Related work
Open-source aero solvers (XFOIL, XFLR5, AVL, OpenVSP, SU2) provide high-fidelity analysis of fixed-wing and rotor geometries at the CAD-parameter level [2–5]. These tools are not designed for interactive, chat-driven bill-of-material generation; they require the user to have already chosen a geometry and to import it into the solver. Commercial configurators at retailer websites (GetFPV, Pyrodrone, Motion RC) offer parts compatibility filtering but no physics-derived performance estimation. Online calculators such as eCalc [6] provide numerical estimation of thrust and flight time for a chosen set of components but operate on generic parameter inputs rather than a typed configuration.
Within the language-model tool-use literature, function-calling with externally validated outputs is increasingly standard [7, 8]. Our contribution is less a novel LM technique than a careful engineering pattern: place the physics outside the model, expose it through structured tool calls, and surface predicate violations back into the conversation as first-class objects. We are not aware of a prior published description of this pattern applied to drone design.
3 · Configuration model
A DRONA design is a DroneConfig: a typed, JSON-serializable object whose shape is defined in app/src/lib/types.ts. The top-level fields relevant to this paper are:
```typescript
interface DroneConfig {
  frame: FrameSpec;
  motors: MotorSpec[];
  propellers: PropellerSpec;
  battery: BatterySpec;
  esc: EscSpec;
  flightController: FlightControllerSpec;
  camera: CameraSpec;
  receiver: ReceiverSpec;
  gps: GpsSpec;
  payload: PayloadItem[];
  // ... mission, constraints, metadata
}
```
Each sub-object (FrameSpec, MotorSpec, etc.) carries the engineering-relevant fields — wheelbase in millimeters, motor KV and stator dimensions, battery cell count and capacity, ESC continuous and burst amp ratings, propeller diameter and pitch — rather than opaque part numbers. The object is hermetic: given a DroneConfig, every spec and every compatibility check is deterministic. Two callers of the specs engine with byte-identical input must produce byte-identical output.
The pattern is deliberate. It lets us treat the specs and compatibility layers as pure functions, which in turn lets us (a) memoize aggressively on the client, (b) snapshot-test on every change to either engine, (c) share the implementation between server and client without a runtime boundary, and (d) serialize any configuration to an archival format (.vdx, TOML-based) with a guaranteed round-trip.
4 · Specs engine
The specs engine computes the fields of a ComputedSpecs object from a DroneConfig. The primary computed fields are all-up weight (AUW), thrust-to-weight ratio (TWR), maximum speed, hover / cruise / sprint flight time, hover and peak current draw, noise level, and total cost. Below we present the derivation from first principles and the two empirical adjustment factors.
4.1 Weight
All-up weight is the sum of per-part masses, scaled up by a wiring-overhead factor. Let $m_i$ denote the mass of component $i \in C$, where $C$ is the set of parts in the configuration. Then:

$$m_{\text{AUW}} = (1 + \alpha) \sum_{i \in C} m_i, \qquad \alpha = 0.05$$
The wiring-overhead factor $\alpha$ accounts for harness mass, solder, hot-glue, TPU mounts, antenna feeds, and other small-part mass that is not individually catalogued in a DroneConfig. The value $0.05$ was fit on a held-out set of 28 weighed builds spanning whoop (25 g class) through heavy-lift hex (2.8 kg class) with a median error of 3.2% on total AUW.
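As a minimal sketch of the weight model, assuming the overhead is applied multiplicatively to the summed part masses. The `PartMass` type, the `allUpWeightGrams` name, and the sample masses are illustrative and not taken from the DRONA codebase.

```typescript
// Hypothetical shape: one entry per catalogued part, mass in grams.
interface PartMass {
  label: string;
  grams: number;
}

// Wiring-overhead factor quoted in Section 4.1.
const WIRING_OVERHEAD = 0.05;

// All-up weight: summed part masses scaled by (1 + overhead).
// Assumption: the overhead multiplies the part total.
function allUpWeightGrams(parts: PartMass[], overhead: number = WIRING_OVERHEAD): number {
  const partTotal = parts.reduce((sum, p) => sum + p.grams, 0);
  return partTotal * (1 + overhead);
}

// Illustrative numbers only, not a measured build.
const sampleParts: PartMass[] = [
  { label: "frame", grams: 120 },
  { label: "motors (x4)", grams: 132 },
  { label: "battery", grams: 200 },
  { label: "electronics", grams: 48 },
];
const sampleAuw = allUpWeightGrams(sampleParts); // 500 g of parts, about 525 g all-up
```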
4.2 Static thrust
The canonical textbook expression for static thrust from an ideal actuator disk is:
This expression is accurate to within ~10% on mid-range propellers (5-inch class, 2306–2207 motor stators, 4–6S battery) operating near their design point, but diverges at the ends of the hobby drone spectrum: it underestimates static thrust for large-stator, large-prop combinations (because profile drag is sub-linear in tip velocity) and overestimates it for very small stators (because viscous effects dominate at low Reynolds number). We therefore replace the ideal expression with a base-rate scaled by two empirical geometry factors:
where the three factors are:
The two piecewise regions in $f_s$ and $f_p$ reflect the onset of diminishing returns: above a stator diameter of ≈44 mm, additional stator mass contributes less thrust than its mass cost, a finding consistent with reported T-Motor U-series bench data [9]. Similarly, above approximately 10-inch propellers, the logarithmic scaling matches published prop-efficiency data at low-altitude, sea-level conditions. For micro motors (stator diameter below 14 mm, e.g., 0802 whoop motors), we apply a measured floor derived from Betaflight community whoop thrust tests [10]:
Total thrust is $T_{\text{max}} = n_{\text{motor}} \cdot T_{\text{motor}}$, and thrust-to-weight ratio is $\text{TWR} = T_{\text{max}} / (m_{\text{AUW}}\, g)$.
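The last step is plain arithmetic and can be sketched directly. The sketch assumes thrust in newtons and mass in kilograms; the function names and the per-motor thrust figure are hypothetical, and the per-motor thrust is taken as an input rather than computed from the fitted expressions.

```typescript
const STANDARD_GRAVITY = 9.80665; // m/s^2

// Total thrust in newtons from a per-motor thrust figure.
function totalThrustN(motorCount: number, thrustPerMotorN: number): number {
  return motorCount * thrustPerMotorN;
}

// Thrust-to-weight ratio: total thrust divided by weight (mass times g).
function thrustToWeight(totalThrust: number, auwKg: number): number {
  if (auwKg <= 0) throw new RangeError("auwKg must be positive");
  return totalThrust / (auwKg * STANDARD_GRAVITY);
}

// Illustrative: four motors at 12 N each on a 0.6 kg build.
const sampleTwr = thrustToWeight(totalThrustN(4, 12), 0.6); // roughly 8.2
```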
4.3 Maximum speed
We estimate maximum forward airspeed as a linear function of TWR, empirically calibrated and capped to prevent extrapolation into the drag-divergence region. The expression is:
This expression is intentionally simple. A more rigorous derivation would estimate drag coefficient from frame geometry and solve the steady-flight power balance; for our target audience and use case, the TWR proxy correlates with observed maximum airspeed ($r^2 = 0.71$ on our 28-build set) and gives the user a reasonable order-of-magnitude answer. We plan to replace Eq. 8 with a form-factor-aware drag-balance expression in the fixed-wing and VTOL releases (§8).
4.4 Hover power and flight time
Hover power is derived from the quasi-actuator-disk model but collapsed into an empirical efficiency expression:
The linear dependence of efficiency on propeller diameter reflects the fact that larger-diameter props at lower disk loadings operate nearer the ideal momentum limit. The intercept of 5 g/W matches a 3-inch cinewhoop benchmark; the slope was fit against a 14-point dataset spanning 2-inch through 15-inch propellers.
Battery energy, with usable fraction $\beta = 0.8$ to reflect safe discharge limits:
Flight times for the three canonical regimes are then:
The 0.75 and 0.45 coefficients reflect the fact that forward cruise is more efficient than hover (wind-penetration + induced-drag reduction), while sprint operates above the optimum. These values were fit against 38 reported flight-time triples on RotorBuilds [11].
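The hover case of this energy budget can be sketched as below. Several things are assumptions of the sketch rather than statements about the engine: the pack voltage is taken as the nominal voltage, hover efficiency in g/W is passed in as an input instead of being computed from the fitted diameter expression, and all function names are hypothetical. The cruise and sprint regimes are not covered.

```typescript
const USABLE_FRACTION = 0.8; // beta from Section 4.4

// Usable battery energy in watt-hours.
// Assumption: nominal pack voltage is the relevant voltage.
function usableEnergyWh(
  nominalVolts: number,
  capacityAh: number,
  beta: number = USABLE_FRACTION,
): number {
  return beta * nominalVolts * capacityAh;
}

// Hover power in watts from AUW (grams) and hover efficiency (grams per watt).
function hoverPowerW(auwGrams: number, efficiencyGPerW: number): number {
  if (efficiencyGPerW <= 0) throw new RangeError("efficiency must be positive");
  return auwGrams / efficiencyGPerW;
}

// Hover flight time in minutes: usable energy divided by hover power.
function hoverTimeMin(energyWh: number, powerW: number): number {
  if (powerW <= 0) throw new RangeError("power must be positive");
  return (energyWh / powerW) * 60;
}

// Illustrative: 4S pack (14.8 V nominal), 1.5 Ah, 600 g AUW at 6 g/W.
const sampleEnergy = usableEnergyWh(14.8, 1.5); // 17.76 Wh
const samplePower = hoverPowerW(600, 6); // 100 W
const sampleHover = hoverTimeMin(sampleEnergy, samplePower); // about 10.7 min
```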
4.5 Noise
We estimate noise at 1 m, under the assumption that tip speed dominates acoustic output at our scale [12]:
The $20 \log_{10}$ scaling reflects the approximate fifth-power dependence of tip-vortex noise on tip speed divided by the quadratic spreading loss; the intercept of 60 dB matches our reference 5-inch 2207 build at 4S. For the user-facing UI, we surface noise in three categorical buckets (whisper / conversation / loud) derived from thresholds at 65 dB and 80 dB.
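The bucketing step can be sketched in a few lines. The thresholds come from the text; which side of each threshold is inclusive is not stated, so the sketch assumes values at exactly 65 dB or 80 dB fall into the louder bucket. The `noiseBucket` name is hypothetical.

```typescript
type NoiseBucket = "whisper" | "conversation" | "loud";

// Map an estimated noise level (dB at 1 m) to a user-facing bucket.
// Assumption: thresholds are inclusive on the louder side.
function noiseBucket(levelDb: number): NoiseBucket {
  if (levelDb < 65) return "whisper";
  if (levelDb < 80) return "conversation";
  return "loud";
}
```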
5 · Compatibility rules
The compatibility engine evaluates twelve predicates over a DroneConfig. Each rule is tagged with a severity (error, warning, info), a category, and a set of affected parts which the UI uses to highlight components in the 3D viewer. Errors block export. Warnings surface with an explanation and a suggested remedy. Informational messages provide positive confirmation without interrupting flow.
| Rule ID | What it checks | Severity |
|---|---|---|
| prop-clearance | Propeller diameter × 25.4 mm vs spacing between adjacent motors derived from frame geometry. Collision → error; < 10 mm → warning. | error / warn |
| esc-current | ESC continuous amp rating ≥ motor peak draw (per-motor, single-phase worst case). Under-spec → error; < 1.2× → warn. | error / warn |
| kv-voltage | Motor KV × battery voltage → target RPM. Outside the 40–120 kRPM healthy band → warning. | warn |
| twr | Thrust-to-weight ratio against use-case target (racing ≥ 4, freestyle ≥ 3, cinematic ≥ 2, heavy-lift ≥ 1.5). | warn |
| stack-mount | FC and ESC stack-mount patterns match (16×16 / 20×20 / 25.5×25.5 / 30.5×30.5). | error |
| motor-count | Motor count matches frame type (quadx→4, hex→6, octo→8, tricopter→3). | error |
| weight-class | Classifies AUW and surfaces regulatory implications (250 g · 2 kg · 25 kg · > 25 kg thresholds, FAA/EASA). | info |
| battery-c-rating | $C \cdot C_{\text{Ah}} \geq I_{\text{peak}}$; sprint-ready ≥ 150C for racing class. | warn |
| esc-voltage | Battery cell count within ESC input range. | error |
| uart-budget | Required UART count (GPS + telemetry + receiver + VTX control + ...) ≤ FC UARTs. | error |
| prop-shaft | Motor shaft diameter matches prop hub bore (M5 / M8 conventions). | error |
| battery-connector | Connector amp rating ≥ peak current (XT30 20 A / XT60 60 A / XT90 90 A / EC5 120 A). | warn |
Each rule is expressed as a pure function $r: \text{DroneConfig} \to \text{RuleResult}$. The overall validation output is the concatenation of all non-pass results, with a final status aggregated as fail if any error is present, warn if any warning is present, else pass.
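A minimal sketch of one rule and the status aggregation, under stated assumptions: the `RuleResult` fields, the `EscCheckInput` type, representing a pass as `null`, and the function names are all hypothetical. Only the thresholds (under-spec is an error, less than 1.2× headroom is a warning) and the fail / warn / pass aggregation come from the text.

```typescript
type Severity = "error" | "warning" | "info";
type OverallStatus = "fail" | "warn" | "pass";

interface RuleResult {
  ruleId: string;
  severity: Severity;
  message: string;
}

// Hypothetical minimal input for one rule; the real DroneConfig is richer.
interface EscCheckInput {
  escContinuousAmps: number;
  motorPeakAmps: number;
}

// esc-current: under-spec is an error, less than 1.2x headroom is a
// warning, anything else passes (represented here as null).
function escCurrentRule(input: EscCheckInput): RuleResult | null {
  const { escContinuousAmps, motorPeakAmps } = input;
  if (escContinuousAmps < motorPeakAmps) {
    return {
      ruleId: "esc-current",
      severity: "error",
      message: "ESC continuous rating is below motor peak draw",
    };
  }
  if (escContinuousAmps < 1.2 * motorPeakAmps) {
    return {
      ruleId: "esc-current",
      severity: "warning",
      message: "ESC headroom is below 1.2x motor peak draw",
    };
  }
  return null;
}

// Final status: fail if any error, warn if any warning, else pass.
function aggregateStatus(results: RuleResult[]): OverallStatus {
  if (results.some((r) => r.severity === "error")) return "fail";
  if (results.some((r) => r.severity === "warning")) return "warn";
  return "pass";
}

// Keep only non-pass results and attach the aggregated status.
function summarize(
  outcomes: Array<RuleResult | null>,
): { overall: OverallStatus; results: RuleResult[] } {
  const results: RuleResult[] = [];
  for (const outcome of outcomes) {
    if (outcome !== null) results.push(outcome);
  }
  return { overall: aggregateStatus(results), results };
}
```

Because each rule takes only its input and returns a value, a rule can be unit-tested in isolation and the aggregation can be tested with hand-built results.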
5.1 Prop-clearance geometry
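For a quadcopter in X configuration, the distance between adjacent motors is $d_m = w / \sqrt{2}$, where $w$ is the wheelbase (the diagonal motor-to-motor distance). Propeller radius is $r_p = d_p \cdot 25.4 / 2$ in millimeters, where $d_p$ is the propeller diameter in inches. Propeller clearance between adjacent motors is:

$$c = d_m - 2 r_p = d_m - 25.4\, d_p$$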
The engine generalizes this to hex ($d_m = w/2$), octo ($d_m = w \sin(\pi/8)$), tricopter ($d_m = w \sin(\pi/3)$), and deadcat / stretched-X geometries. A negative clearance raises an error; a clearance below 10 mm raises a warning. The 10 mm warning threshold leaves a gap for airflow and debris during aggressive flight [13].
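The symmetric layouts can be sketched with one formula, since the values of $d_m$ quoted for quad-X, hex, octo, and tricopter all equal $w \sin(\pi / n)$ for $n$ evenly spaced motors. The function names are hypothetical, and deadcat and stretched-X geometries are not covered by this sketch.

```typescript
const MM_PER_INCH = 25.4;

// d_m for n motors evenly spaced on a circle whose diameter is the
// wheelbase: w * sin(pi / n). Reproduces w/sqrt(2) for n = 4, w/2 for n = 6.
function motorSpacingMm(wheelbaseMm: number, motorCount: number): number {
  if (motorCount < 3) throw new RangeError("need at least 3 motors");
  return wheelbaseMm * Math.sin(Math.PI / motorCount);
}

// Clearance between neighbouring prop discs: d_m minus one prop diameter
// (two radii), with the diameter converted from inches to millimeters.
function propClearanceMm(
  wheelbaseMm: number,
  motorCount: number,
  propDiameterIn: number,
): number {
  return motorSpacingMm(wheelbaseMm, motorCount) - propDiameterIn * MM_PER_INCH;
}

type ClearanceVerdict = "error" | "warning" | "ok";

// Negative clearance is a collision; under 10 mm is a warning.
function clearanceVerdict(clearanceMm: number): ClearanceVerdict {
  if (clearanceMm < 0) return "error";
  if (clearanceMm < 10) return "warning";
  return "ok";
}

// Illustrative: 225 mm quad with 5-inch props leaves about 32 mm.
const sampleVerdict = clearanceVerdict(propClearanceMm(225, 4, 5)); // "ok"
```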
6 · Isomorphic implementation
The specs engine lives in a single file, app/specs-core.mjs, written in plain ECMAScript and imported both by the Node server (server.js) and by the client (app/src/lib/specs-engine.ts via a @ts-expect-error boundary). The same pattern applies to the compatibility engine at app/compat-core.mjs. This architecture has three useful consequences:
- No dual implementation drift. Because the client and server consume the exact same module, there is no opportunity for the two call paths to compute different specs for identical inputs.
- Tool-use safety. When the language model invokes a tool such as analyze_what_if, the server-side handler calls the same computeSpecsCore function the client will call when re-rendering. The model receives the actual computed specs, not an LLM approximation.
- Testability. Snapshot tests run against specs-core.mjs directly, without a DOM or network. This lets us run the 18-preset regression suite in about 380 ms.
The cost is small: we write the core in plain JavaScript rather than TypeScript, relying on the type signature at the import boundary to catch mismatches. In practice, since both sides import the same DroneConfig TypeScript type, the type information flows through the caller, not through the module itself.
7 · Evaluation
7.1 Snapshot regression
We maintain a snapshot of expected specs output for each of the 18 production presets spanning tiny whoop (25 g) through commercial hex (5.8 kg). The snapshot is checked into version control. Any change to the specs engine either (a) preserves all 18 snapshots byte-identically, or (b) updates them with a reviewed diff documenting the intended change. This regime catches accidental regressions, floating-point drift between Node versions, and off-by-one errors in the piecewise fit.
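One way such a check could look is sketched below. It is not the project's test harness: `SpecsOutput` is a simplified stand-in for the engine's output, serializing with sorted keys is a choice made here so the comparison is insensitive to property order, and all names are hypothetical.

```typescript
// Hypothetical stand-in for the engine's output: a flat map of numeric fields.
type SpecsOutput = Record<string, number>;

interface StoredSnapshot {
  preset: string;
  expected: string; // serialized specs as checked into version control
}

interface SnapshotMismatch {
  preset: string;
  expected: string;
  actual: string;
}

// Serialize with sorted keys so the string depends only on the values,
// not on property insertion order.
function canonicalJson(specs: SpecsOutput): string {
  const keys = Object.keys(specs).sort();
  return JSON.stringify(keys.map((k) => [k, specs[k]]));
}

// Recompute every preset and report any whose serialized output differs
// from the stored string. An empty result means all snapshots are preserved.
function diffSnapshots(
  compute: (preset: string) => SpecsOutput,
  stored: StoredSnapshot[],
): SnapshotMismatch[] {
  const mismatches: SnapshotMismatch[] = [];
  for (const snap of stored) {
    const actual = canonicalJson(compute(snap.preset));
    if (actual !== snap.expected) {
      mismatches.push({ preset: snap.preset, expected: snap.expected, actual });
    }
  }
  return mismatches;
}
```

Comparing serialized strings rather than parsed numbers is what makes small floating-point drift visible: two values that differ in the last digit serialize differently and show up in the diff.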
7.2 Validation against published reference designs
We validate against 11 published reference designs from three independent sources: manufacturer datasheets (T-Motor, DJI Enterprise, BlueRobotics), independent bench tests (FPV WTFOS, Joshua Bardwell), and public DIY build logs (RotorBuilds, the FliteTest community). For each reference we compare computed AUW, TWR, and hover flight time against published values.
| Reference design | AUW err | TWR err | $t_{\text{hover}}$ err | Agreement |
|---|---|---|---|---|
| iFlight Nazgul Evoque F5D | +2.1% | −4.8% | +6.2% | ±15% |
| Chimera 7 Pro LR | +1.4% | −7.2% | −11.4% | ±15% |
| Mobula 6 (whoop) | +4.0% | +3.2% | +9.8% | ±15% |
| TBS Source One V5 5" | +0.9% | −2.1% | +5.6% | ±15% |
| Armattan Rooster 6" | +3.2% | +1.8% | −4.3% | ±15% |
| DJI Matrice 30 | −5.8% | +0.4% | −14.1% | ±15% |
| DJI Mavic 3 Pro | +2.6% | −6.7% | −8.9% | ±15% |
| Freefly Alta X | −7.4% | −12.1% | +13.4% | ±15% |
| Skydio X10 | +4.1% | −9.8% | −11.2% | ±15% |
| DJI Agras T40 | −11.3% | −18.6% | +21.4% | ±25% |
| Custom 8" cine hex | +8.9% | −14.2% | −19.8% | ±25% |
The two designs outside ±15% are both at the edges of the calibration envelope: the Agras T40 has 30-inch carbon props operating on 14S at an unusually high disk loading, and our 8-inch hex is a custom build whose empirical scaling we had not previously validated. Both remain within the ±25% range we consider acceptable for the specs engine's stated purpose — design-stage estimation for consumer and prosumer builds, not certification-grade analysis.
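The Agreement column is consistent with banding each design by its worst absolute error across the three quantities. A short sketch of that reading follows; the `agreementBand` name and the "outside" label for errors beyond ±25% are hypothetical additions.

```typescript
type AgreementBand = "±15%" | "±25%" | "outside";

// Band a design by the largest absolute percentage error among the
// compared quantities (AUW, TWR, hover time).
function agreementBand(errorsPct: number[]): AgreementBand {
  const worst = errorsPct.reduce((m, e) => Math.max(m, Math.abs(e)), 0);
  if (worst <= 15) return "±15%";
  if (worst <= 25) return "±25%";
  return "outside";
}

// Rows taken from the table above.
const matrice30Band = agreementBand([-5.8, 0.4, -14.1]); // "±15%"
const agrasT40Band = agreementBand([-11.3, -18.6, 21.4]); // "±25%"
```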
7.3 Compatibility-rule coverage
Across the 18 production presets, the compatibility engine produces 0 errors (all presets build), 2 warnings (both in the battery-c-rating category on racing presets, which we accept as design choices), and 12 info messages (weight-class classifications). A targeted regression suite of 40 deliberately invalid configurations exercises each rule at least three times; all 40 produce the expected error or warning.
8 · Discussion
We highlight three points that we think are generalizable beyond drones.
External physics is a form of model honesty. A language model that invents a specification is failing quietly. A language model whose specifications are filtered through an external predicate system fails loudly: the predicate raises an error and the model is required to correct itself in the same conversation turn. This turns out to be easier to engineer, and easier to evaluate, than trying to train the model away from hallucination.
Deterministic specs enable aggressive memoization. Because computeSpecsCore is pure and hermetic, the client memoizes on a shallow hash of DroneConfig and recomputes at interactive latency (< 1 ms) during drag-to-adjust interactions. This makes the 3D-viewer's computed-specs overlay tractable without a server round-trip.
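A minimal sketch of memoizing a pure function on its configuration. To keep the example short it keys the cache on the full `JSON.stringify` output rather than the shallow hash mentioned above, which is a substitution made here; `memoizeByKey` and the toy engine are hypothetical.

```typescript
// Wrap a pure function so repeated calls with an equal config reuse the
// cached result. The key function defaults to the JSON serialization.
function memoizeByKey<C, R>(
  compute: (config: C) => R,
  keyOf: (config: C) => string = (c) => JSON.stringify(c),
): (config: C) => R {
  const cache: Record<string, { value: R }> = {};
  return (config: C): R => {
    const key = keyOf(config);
    const hit = cache[key];
    if (hit !== undefined) return hit.value;
    const value = compute(config);
    cache[key] = { value };
    return value;
  };
}

// Toy engine that counts how often it actually runs.
let engineCalls = 0;
const cachedRatio = memoizeByKey((cfg: { thrustN: number; weightN: number }) => {
  engineCalls += 1;
  return cfg.thrustN / cfg.weightN;
});
cachedRatio({ thrustN: 48, weightN: 6 }); // computes
cachedRatio({ thrustN: 48, weightN: 6 }); // served from cache; engineCalls stays at 1
```

This is only sound because the wrapped function is pure. Two caveats of the sketch: the JSON key depends on property order, and the cache is unbounded, so a real client would likely cap or evict entries.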
Form-factor expansion is a discipline, not a refactor. Our fixed-wing, VTOL, helicopter, and ROV engines (in development) follow the same pattern: a dedicated .mjs per form factor, dispatched from computeSpecsCore. The universal rules (§5) remain universal; form-factor-specific rules (wing loading, disc loading, buoyancy margin) live in their own files. The isomorphic discipline extends cleanly.
9 · Limitations
The engine is explicitly a design-stage estimator. It does not replace:
- A certified aero solver (XFLR5, AVL, CFD) for fixed-wing analysis near critical design points.
- A thermal model for ESC/FC under sustained near-burst operation.
- A battery chemistry model (Warburg impedance, temperature curves) for cold-weather operation.
- A gimbal-jitter model for cinematic applications.
- Any regulatory compliance authority. The weight-class info flags are advisory; a user is responsible for jurisdictional rules.
We view these as out-of-scope for a conversational design tool and in-scope for downstream professional tools. Where appropriate, DRONA exports to the file formats those tools consume (STEP, KML, MAVLink .plan, ArduPilot .param).
10 · Conclusion
We have described a deterministic, isomorphic specs engine and a twelve-rule compatibility system that together ground every recommendation produced by DRONA's conversational design agent. The engine is derived from first-principles physics with two empirical adjustment factors, validated against 11 published reference designs, and regression-tested against 18 production presets. The pattern — place the physics outside the model, expose it through typed tool calls, surface violations as first-class objects — is straightforwardly generalizable to other conversational hardware-design tools.
The accompanying code is linked in §6; the 18-preset snapshot and the 40-case regression suite live alongside the core modules in the DRONA repository. We welcome replication, extension, and critique.
References
- [1] DRONA Labs, "Custom-drone builder pain-point survey," Internal report, 2026.
- [2] M. Drela, "XFOIL: An analysis and design system for low Reynolds number airfoils," Low Reynolds Number Aerodynamics, Springer, 1989.
- [3] A. Deperrois, "XFLR5: Analysis of foils and wings operating at low Reynolds numbers," 2003.
- [4] M. Drela and H. Youngren, "AVL: An extended vortex-lattice model for aerodynamic analysis," MIT, 2004.
- [5] J. R. Gloudemans et al., "OpenVSP: Open-source vehicle sketch pad," NASA, 2012.
- [6] M. Müller, "eCalc — electric flight calculator," ecalc.ch, 2005–present.
- [7] T. Schick et al., "Toolformer: Language models can teach themselves to use tools," NeurIPS, 2023.
- [8] Anthropic, "Tool use with Claude," docs.anthropic.com, 2024.
- [9] T-Motor, "U-series motor bench-test datasheet," store.tmotor.com, accessed 2026-03.
- [10] Betaflight community, "Whoop thrust test compendium," GitHub betaflight/wiki, 2020–2025.
- [11] RotorBuilds community, "Flight-time reported triples," rotorbuilds.com, 2018–2025.
- [12] G. Sinibaldi and L. Marino, "Experimental analysis on the noise of propellers for small UAV," Applied Acoustics, vol. 74, 2013.
- [13] J. Bardwell, "Propeller clearance and tip-vortex noise in 5-inch quads," joshuabardwell.com, 2022.
research@drona.design.