
Humanoid Robots: AI's Best Hope?

As we enter year three since OpenAI catalyzed the current AI wave, capital continues to pour into the field. Dolphin Research's recent work argues that 2026 will hinge on two things: cutting compute costs, and getting AI spend to translate into real software and hardware products. New hardware rollouts are the true incremental opportunity.
Tesla’s Optimus is nearing mass production, and humanoid robots could become the primary carrier of embodied AI, reshaping human interaction and productivity in profound ways.
Against this backdrop, Dolphin Research is launching coverage of the humanoid robot supply chain. This first piece takes an industry and fundamentals view from the upstream, analyzing the difficulty and opportunities in components and cost-down, with focus on three questions:
1) What are the key links in the humanoid robot supply chain? 2) Where are the industrialization bottlenecks in hardware? 3) Where are the hardware opportunities?
I. Humanoids: born AI-native
Start with the basics: humanoid robots feature a human-like 'form' and a human-like 'brain'. Form implies arms, legs, head, upright bipedal locomotion and broad functional versatility.
The 'brain' means multimodal perception, continual learning and decision-making. Combine human form with a human-like brain to pursue generality: not just standing and walking, but moving boxes and brewing coffee, handling heavy lifting and fastening screws on the line.
These skills are not pre-scripted. They must be learned through interaction with diverse external signals, and decisions need to be made autonomously on top of that.
To reach the level of generality required by humanoids, compute, algorithms, data, and tight HW/SW coupling are all indispensable. Breakthroughs in AI algorithms and GPUs/ASICs over the past two years enable rapid iteration on compute and models, but hardware constraints are fundamentally different from the past.
In smartphones and autos, connectivity or transportation provided baseline utility and shipment scale even before intelligence. A humanoid without an intelligent brain is basically a metal figure; shipments must be supported by the AI brain, otherwise the product has little practical utility.
This makes hardware the bigger constraint for this new AI-era category. First, requirements diverge sharply from other industries, and some hardware must be created from scratch. For example, humanoids need highly sensitive 'touch', yet both tactile hardware and tactile data barely exist today.
Second, costs must be low because the goal is often labor substitution. Back-of-the-envelope projections such as 1 billion humanoids by 2050, vs. 8 billion people and ~5 billion internet users, require an affordable price point to achieve auto-like penetration.
II. Supply chain breakdown: components still in the 'training arc'
Elon Musk targets a sub-$20k price per unit, roughly the entry price of a smart EV. The supply chain complexity is similarly high.
From a value chain lens, the industry can be divided into upstream, midstream, and downstream.
1) Upstream: watch OEM–supplier engagement models
Upstream includes suppliers to the OEMs: actuators, sensors, encoders, controllers/drivers, integrated modules, plus compute, algorithms and chips. Notably, humanoid hardware overlaps heavily with autos, especially NEVs, so OEM–supplier models mirror that sector’s diversity.
Tesla can source parts directly, build its own modules and assemble in-house, or buy modules/sub-assemblies such as dexterous hands or certain joints. As in NEVs, module and sub-assembly supply to the OEM is likely the dominant collaboration path for robots at this stage.
2) Midstream: auto OEM crossovers + new startups
OEMs are the companies that make and sell humanoids, with leading players concentrated in China and the U.S. Beyond $Tesla(TSLA.US) and $XPeng(XPEV.US), the field is dominated by startups; early crossovers are fewer but better capitalized.
3) Downstream: end-demand depends on product readiness
Given today’s limited generality, current use cases are specialized: research, education, and tours. Industrial and household scenarios offer large potential but lack commercialization readiness. When humanoids will generalize is a topic for later pieces; here we focus on upstream hardware progress.
III. Upstream deep-dive: where is the gold mine?
Musk has called humanoids a multi-trillion-dollar opportunity. He most recently said an Optimus Gen 3 prototype will be demoed in Q1 2026, with mass production starting by end-2026 (initial annual run-rate of ~50k units) on a line designed for 1 mn units a year; Gen 4 targets 10 mn units, and Gen 5 possibly 50–100 mn units of capacity.
If Tesla executes, the TAM is obviously large. So we dissect the upstream hardware value chain using Optimus as a reference.
Structurally, Optimus comprises head, body joints and dexterous hands. Below we map Gen 2’s main sections, component locations and our per-unit cost estimates:
(Note: unit costs estimated off current market/production assumptions, RMB, for reference only.)
From an architecture view, humanoids split into a perception layer, decision layer and execution layer, akin to smart autos but more complex. Details as follows:
1) Perception layer: sensors and the brain
The brain refers to AI models, which we set aside here. Sensors include vision, tactile, force/torque, and position.
(i) Vision: Tesla uses a pure 2D camera stack
What is a vision sensor? Think of it as the eyes, converting light into signals for environmental perception, object recognition and localization.
What are the solution paths? Tesla pursues vision-only with 2D cameras; most others use multi-sensor stacks such as 3D cameras (structured light/ToF/stereo), LiDAR and mmWave radar.
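As a rough illustration of how the 3D camera options above recover depth, the sketch below applies the standard rectified-stereo relation Z = f·B/d and the time-of-flight relation d = c·t/2. The focal length, baseline, disparity and timing values are illustrative assumptions, not the specifications of any OEM's camera.

```python
# Speed of light in m/s (exact by SI definition).
C_LIGHT = 299_792_458.0


def stereo_depth_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def tof_distance_m(round_trip_s: float) -> float:
    """Distance from a time-of-flight round trip: d = c * t / 2."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    return C_LIGHT * round_trip_s / 2.0


# Hypothetical inputs chosen only to exercise the formulas.
depth = stereo_depth_m(focal_px=800.0, baseline_m=0.06, disparity_px=32.0)
dist = tof_distance_m(round_trip_s=10e-9)
print(f"stereo depth: {depth:.3f} m, ToF distance: {dist:.3f} m")
```

Note how stereo depth scales inversely with disparity: distant objects produce small disparities, so a one-pixel matching error costs far more accuracy at range than up close.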
Where are the challenges? The tech routes largely mirror consumer electronics and ADAS/AD, but humanoids demand higher dynamic performance, real-time capability, integration and low power.
Key 3D camera vendors include $Orbbec(688322.SH), already engaged with multiple domestic OEMs; LiDAR overlaps with autos, led by $Hesai(HSAI.US) and $ROBOSENSE(02498.HK). However, Tesla needs only three 2D cameras at ~RMB 350 each, so at low shipment scale, value-add to suppliers is limited.
(ii) Tactile: core bottleneck with no settled path
What is a tactile sensor? Think of it as skin, primarily in the hands, sensing contact forces such as pressure, texture, friction and temperature — hence 'electronic skin'. This is a primary hardware bottleneck for humanoids.
Tactile is a new field born with humanoids, with minimal cross-industry reuse. It requires high precision and sensitivity plus consistency, flexibility, reliability, durability and integration — a key area requiring breakthroughs.
For example, precision is constrained by physics, miniaturization/integration and dynamic response; insufficient precision can distort signals, causing models to learn spurious patterns. Consistency is constrained by manufacturing: small variations in materials and process parameters across batches create notable output dispersion.
Long-term drift compounds the issue. These factors can drive model overfitting or noisy training and ultimately poor generalization.
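To make the consistency problem concrete, here is a minimal sketch that assumes each tactile sensor behaves as an idealized linear device with its own gain and offset. The gain/offset values are made up to stand in for batch-to-batch variation, and the two-point calibration is a generic technique rather than any supplier's actual process.

```python
from statistics import pstdev

# Hypothetical (gain, offset) pairs standing in for batch-to-batch variation.
SENSORS = [(0.92, 0.03), (1.00, 0.00), (1.08, -0.02), (1.15, 0.05)]


def raw_reading(pressure_kpa, gain, offset):
    """Idealized linear sensor: output = gain * pressure + offset."""
    return gain * pressure_kpa + offset


def two_point_calibration(gain, offset, p_lo=0.0, p_hi=100.0):
    """Estimate (gain, offset) from readings at two known reference pressures."""
    r_lo = raw_reading(p_lo, gain, offset)
    r_hi = raw_reading(p_hi, gain, offset)
    est_gain = (r_hi - r_lo) / (p_hi - p_lo)
    est_offset = r_lo - est_gain * p_lo
    return est_gain, est_offset


def calibrated_reading(raw, est_gain, est_offset):
    """Invert the linear model to recover pressure from a raw reading."""
    return (raw - est_offset) / est_gain


true_pressure = 50.0  # kPa applied identically to every sensor
raw = [raw_reading(true_pressure, g, o) for g, o in SENSORS]
cal = [
    calibrated_reading(r, *two_point_calibration(g, o))
    for r, (g, o) in zip(raw, SENSORS)
]
raw_spread = pstdev(raw)  # dispersion a model would see without calibration
cal_spread = pstdev(cal)  # dispersion after per-sensor calibration
print(f"raw spread: {raw_spread:.3f}, calibrated spread: {cal_spread:.3g}")
```

In this toy model calibration removes the dispersion entirely. Real sensors are nonlinear and drift over time, so the calibration would have to be repeated or tracked online, which is part of why consistency is treated as a manufacturing problem rather than a software one.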
Where are the challenges in production? Material choices (sensitive layers, flexible electrodes), structural design, manufacturing/packaging (lithography, 3D printing), and signal-processing algorithms that decouple multi-dimensional features from single physical signals — all require strong end-to-end capability.
What are the solution paths? Mainly piezoresistive and capacitive. Piezoresistive sensing converts resistance changes to signals, with simpler structures but weaker dynamics and consistency. Capacitive sensing converts capacitance changes to signals; it offers better dynamics and consistency and, while less mature today, is the likely direction of travel.
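The capacitive principle can be sketched with the textbook parallel-plate formula C = ε0·εr·A/d: pressure squeezes the dielectric, the gap d shrinks and capacitance rises. The taxel geometry, permittivity and the linear-elastic dielectric model below are all illustrative assumptions, not data from any vendor.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def capacitance_f(area_m2, gap_m, eps_r):
    """Parallel-plate capacitance: C = eps0 * eps_r * A / d."""
    return EPS0 * eps_r * area_m2 / gap_m


def gap_under_pressure_m(gap0_m, pressure_pa, modulus_pa):
    """Assumed linear-elastic dielectric: strain = pressure / modulus."""
    return gap0_m * (1.0 - pressure_pa / modulus_pa)


def pressure_from_capacitance_pa(c_f, area_m2, gap0_m, eps_r, modulus_pa):
    """Invert the two relations above to read pressure back from capacitance."""
    gap = EPS0 * eps_r * area_m2 / c_f
    return modulus_pa * (1.0 - gap / gap0_m)


# Hypothetical taxel: 2 mm x 2 mm plates, 50 um gap, eps_r = 3, 1 MPa dielectric.
AREA = 2e-3 * 2e-3
GAP0 = 50e-6
EPS_R = 3.0
MODULUS = 1e6

c_rest = capacitance_f(AREA, GAP0, EPS_R)
c_pressed = capacitance_f(AREA, gap_under_pressure_m(GAP0, 20e3, MODULUS), EPS_R)
recovered = pressure_from_capacitance_pa(c_pressed, AREA, GAP0, EPS_R, MODULUS)
print(f"rest: {c_rest * 1e12:.3f} pF, pressed: {c_pressed * 1e12:.3f} pF")
print(f"recovered pressure: {recovered:.1f} Pa")
```

With these numbers the signal is a change of a few hundredths of a picofarad on a roughly 2 pF baseline, which hints at why readout electronics and shielding matter as much as the sensing material.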
Value share is modest now but could rise. We estimate Optimus hands need 10+ tactile sensors, worth ~RMB 3,000 per unit today, falling toward ~RMB 1,500 as the industry matures, or ~2% of BOM.
Note the tech stack is not settled. If capacitive replaces piezoresistive and arrays/multimodal sensing become necessary, the value share could increase.
Supplier landscape: Leaders include Novasentis, Tekscan, JDI, Baumer and Fraba across the U.S. and Japan. Chinese players are accelerating:
$Keli(603662.SH) has invested in tactile specialists such as Hisensemi and Yuanshengxianda; $Hanwei Electronics(300007.SZ) is partnering with multiple OEMs and building lines; $Fulai New Material(605488.SH) has pilot lines and is supplying several OEMs.
(iii) Force/torque: six-axis FT is key, import substitution needed
What is a force/torque sensor? It measures force and torque; think of the effort to twist open a bottle cap. Basic 1-axis FT has low barriers, so we focus on six-axis FT.
What is six-axis FT? It measures six components at once: three forces (Fx, Fy, Fz) along, and three torques (Mx, My, Mz) about, three orthogonal axes. Optimus Gen 2 uses four units at wrists and ankles, a core sensor for motion control.
Exhibit: six-axis FT elastic body (flexure) schematic
Source: Dolphin Research
Where are the challenges? Humanoid-grade six-axis FT requires high integration, dynamic response, overload tolerance and precision. Key hurdles include:
a) structural design to maintain high sensitivity under tiny deformation; b) decoupling algorithms to isolate six components while minimizing cross-axis crosstalk;
c) die-attach/packaging to overcome inconsistency in traditional processes; d) calibration processes with far more dimensions than 1-axis FT to ensure accurate mapping from signal to physical quantity.
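Hurdles b) and d) can be illustrated with a toy linear model: six raw channel signals are a mix of the six load components, calibration identifies that mixing matrix from known loads, and its inverse is the decoupling matrix. The crosstalk pattern and load values are invented for the sketch; real sensors typically need many more calibration load cases and a least-squares fit.

```python
def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def invert(m):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical sensor response: unit gain on each axis plus up to 5% crosstalk.
TRUE_RESPONSE = [[1.0 if i == j else 0.05 * ((i + j) % 3 - 1) for j in range(6)]
                 for i in range(6)]


def sensor_signals(wrench):
    """Raw channel outputs for a load [Fx, Fy, Fz, Mx, My, Mz]."""
    return mat_vec(TRUE_RESPONSE, wrench)


# Calibration: apply six unit loads; each response is one column of the matrix.
columns = [sensor_signals([1.0 if k == j else 0.0 for k in range(6)])
           for j in range(6)]
measured_response = [[columns[j][i] for j in range(6)] for i in range(6)]
decoupling = invert(measured_response)

wrench = [10.0, -5.0, 30.0, 0.2, -0.1, 0.05]  # illustrative load (N and N*m)
signals = sensor_signals(wrench)
recovered = mat_vec(decoupling, signals)
naive_error = max(abs(s - w) for s, w in zip(signals, wrench))
decoupled_error = max(abs(r - w) for r, w in zip(recovered, wrench))
print(f"naive error: {naive_error:.4f}, decoupled error: {decoupled_error:.2e}")
```

Reading the channels directly lets a large Fz leak into the small torque channels, while the decoupling matrix recovers the applied load. That leak is the cross-axis crosstalk the calibration process has to remove.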
Value share is moderate. We estimate four six-axis FT sensors at ~RMB 5,400 per unit today, falling to ~RMB 3,200, or ~3% of BOM.
Supplier landscape: Mature overseas supply exists, led by U.S.-based ATI Industrial Automation, with high barriers. Cost-down likely depends on Chinese entrants such as Keli Sensing, Amperex and $Lingyun Ind(600480.SH), some already in volume programs.
Substitution risk: Algorithmic advances could reduce or replace six-axis FT needs, a key risk to this segment.
(iv) Position: IMU usage could rise sharply; high-end requires localization
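What is a position sensor? Primarily the IMU (inertial measurement unit). While FT senses 'force', the IMU senses 'position': its accelerometers measure linear acceleration and its gyros measure angular rate, from which pose is estimated for balance and motion. It is also core to motion control.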
Optimus may use two or more IMU controller chips today, with future increases to add redundancy and fault tolerance.
Exhibit: accelerometer and gyro
Source: Innalabs; Dolphin Research
Exhibit: IMU schematic
Source: a referenced IMU patent (Zhuzhou Fisrock Optoelectronics); Dolphin Research
Where are the challenges? Humanoid-grade IMUs require very high precision (e.g., accelerometer bias stability and gyro angle random walk, ARW), well above consumer levels, plus advanced manufacturing and sensor-fusion algorithms.
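A complementary filter is a deliberately simple stand-in for the fusion algorithms mentioned above. It blends the gyro, which is smooth but drifts with bias, with the accelerometer's gravity-based pitch, which is noisy but drift-free. The bias, noise level, blend factor and axis convention below are assumptions for illustration; this is not a description of any OEM's estimator.

```python
import math

DT = 0.01          # sample period in s (assumed 100 Hz)
ALPHA = 0.98       # blend factor: weight on the gyro-propagated estimate
GYRO_BIAS = 0.02   # rad/s, hypothetical constant gyro bias
TRUE_PITCH = 0.10  # rad, the robot holds a fixed lean in this toy scenario
G = 9.81           # m/s^2


def accel_pitch(ax, az):
    """Pitch from the gravity direction (assumed axis convention, no body acceleration)."""
    return math.atan2(ax, az)


def complementary_step(prev, gyro_rate, ax, az, dt=DT, alpha=ALPHA):
    """One filter update: trust the gyro short-term, the accelerometer long-term."""
    return alpha * (prev + gyro_rate * dt) + (1.0 - alpha) * accel_pitch(ax, az)


gyro_only = TRUE_PITCH  # pure integration, started from the correct angle
fused = 0.0             # filter started from a wrong initial guess
for step in range(1000):  # 10 s of data
    noise = 0.05 if step % 2 == 0 else -0.05  # crude alternating accel noise
    ax = G * math.sin(TRUE_PITCH + noise)
    az = G * math.cos(TRUE_PITCH + noise)
    gyro_rate = 0.0 + GYRO_BIAS  # true rate is zero; the sensor adds bias
    gyro_only += gyro_rate * DT
    fused = complementary_step(fused, gyro_rate, ax, az)

gyro_only_error = abs(gyro_only - TRUE_PITCH)
fused_error = abs(fused - TRUE_PITCH)
print(f"gyro-only error: {gyro_only_error:.3f} rad, fused error: {fused_error:.3f} rad")
```

The residual error of the fused estimate grows with gyro bias, which is one way to see why bias stability and ARW figures drive IMU grade and price.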
Localization progress: IMUs are mature, but high-end grades (industrial/auto/tactical) are controlled by European/U.S. firms such as $BOSCH(BOSCH.NA) and $Honeywell(HON.US), with strong moats. Domestic players are advancing, including $XDLK(688582.SH) (industrial-grade products) and $W-Ibeda(688071.SH) (already supplying OEMs).
Value could increase materially. We estimate two main IMUs at ~RMB 8,400 per unit today, falling to ~RMB 4,200, or ~4% of BOM, with potential upside as architectures evolve.
2) Decision layer: the AI model 'brain' and AI semis, not covered here.
3) Execution layer: the 'cerebellum' and actuators
The cerebellum refers to motion control, effectively a smaller model beneath the foundation model. Actuators center on joints (linear and rotary) and dexterous hands (essentially higher-precision joints). Linear joints pair motors with lead screws; rotary joints pair motors with gear reducers.
(i) Motors: large value share; cost-down likely led by China
All body and hand joints, rotary or linear, require motors. Humanoids are electrically actuated: legacy hydraulic/pneumatic approaches (e.g., Boston Dynamics’ early path) suffer precision issues and poor integration with electronic control, clashing with intelligence needs.
What is special about humanoid motors? Optimus body joints primarily use frameless torque motors; hands use coreless motors, with a possible shift to micro frameless torque units later.
Frameless torque motors combine high torque, precision and power density with fast response and reliability, while remaining light, compact, easy to integrate and cost-effective in tight spaces. Coreless motors fit extreme space constraints in dexterous hands.
Exhibit: frameless torque motor
Source: Kollmorgen; Dolphin Research
Frameless torque motors remove housing and shaft, leaving ring-shaped rotor and stator. The rotor’s ring magnets and steel ring integrate with the joint’s main shaft; the stator’s laminations and windings integrate with the shell.
By eliminating traditional mechanical structures, they achieve high integration, power density and response, meeting humanoid needs for light weight, high load, dynamics, precision and reliability. Hence they are the mainstream body-joint drive today.
Exhibit: coreless motor
Source: MOONS'; Dolphin Research
Coreless motors feature a self-supporting cup-shaped coil without iron cores, with the stator built from permanent magnets. This design avoids cogging and core losses, reduces mass, and delivers compact size, smooth operation, fast dynamics, high efficiency/low heat and high power density.
In dexterous hands, they meet requirements for extreme integration, high precision/stability in grasping, fast dynamics and high reliability.
Where are the challenges? While motor tech is mature, high-end products in China still lag: frameless torque motors require advanced magnetic circuit design, assembly processes, materials and structures; coreless motors demand sophisticated coil design and winding processes. These hurdles are surmountable.
What is the value share? We estimate ~40 motors per Optimus at ~RMB 28,800 per unit today, falling to ~RMB 14,400, or ~14% of BOM, notably above the sensors above.
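As a sanity check on the value-share figures quoted so far, the sketch below backs out the total BOM each estimate implies. It assumes every percentage pairs with the lower, matured cost figure, which is one reading of the text rather than something it states, and the shares are rounded, so the implied totals are only rough.

```python
# (matured cost in RMB, approximate BOM share) as quoted in the text.
ESTIMATES = {
    "tactile sensors": (1_500, 0.02),
    "six-axis FT sensors": (3_200, 0.03),
    "IMUs": (4_200, 0.04),
    "motors": (14_400, 0.14),
}


def implied_bom_rmb(cost_rmb, share):
    """Total BOM implied by one component's cost and its share of that BOM."""
    return cost_rmb / share


implied = {name: implied_bom_rmb(c, s) for name, (c, s) in ESTIMATES.items()}
low, high = min(implied.values()), max(implied.values())

# Motor cost relative to the three sensor buckets combined.
sensor_total = sum(c for name, (c, _) in ESTIMATES.items() if name != "motors")
motor_vs_sensors = ESTIMATES["motors"][0] / sensor_total

for name, total in implied.items():
    print(f"{name}: implied BOM ~RMB {total:,.0f}")
print(f"range: RMB {low:,.0f} to {high:,.0f}; motors/sensors = {motor_vs_sensors:.2f}x")
```

Under this reading three of the four buckets point to a matured BOM a little above RMB 100k. The tactile figure implies less, which is consistent with its ~2% share being the most coarsely rounded.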
Industry landscape: Frameless leaders include Kollmorgen (U.S.), TQ-RoboDrive (Germany) and Maxon (Switzerland). Coreless leaders include Maxon, Portescap and Faulhaber.
In China, suppliers include Autoware (frameless torque leader in industrial/cobots, now shipping to humanoid OEMs), MOONS' (leading coreless), Inovance (coreless mass-production with overseas engagements), ZHAOWEI (early overseas collaborations), Wolong Electric (frameless torque shipments to domestic OEMs), Leadshine (mature frameless/coreless portfolios), and $Sanhua(002050.SZ) $SANHUA(02050.HK) (in-house frameless torque R&D).
(ii) Lead screws: likely to become mainstream; a key import-substitution target
Why split body joints into linear vs. rotary? Human wrists/shoulders and knees/ankles are fundamentally rotary, sometimes multi-DoF, but a robot can drive such a joint with either a rotary or a linear actuator. Robots need both.
a) Rotary joints rotate around a pivot; b) Linear joints move along a line. Linear actuators address some rotary drawbacks.
Human jumping relies on quasi-linear muscle contraction, which robots can emulate. Versus rotary, linear can deliver more precise torque output, better efficiency (motors idle when static), and higher structural rigidity/impact tolerance — especially with planetary roller screws.
What is a lead screw? It converts rotary motion to linear motion via a screw and nut; think turning a screwdriver to drive a screw straight in.
In operation, the motor drives the nut, rollers engage with the screw’s helical grooves, and the screw translates linearly. Versus ball screws, planetary roller screws offer higher load and stiffness (line vs. point contact), greater thrust and self-locking (no power draw when static) — well-suited for humanoids.
But technical difficulty is higher. They require tighter precision and superior alloy performance, demanding advanced heat treatment and precision grinding; thread grinders are still imported.
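The rotary-to-linear conversion described above follows two textbook relations: linear speed equals revolutions per second times the screw lead, and ideal thrust is F = 2π·η·T / lead. The lead, motor speed, torque and efficiency below are hypothetical values, not Optimus specifications.

```python
import math


def linear_speed_m_s(motor_rpm: float, lead_m: float) -> float:
    """Nut/screw travel speed: one revolution advances the screw by one lead."""
    return motor_rpm / 60.0 * lead_m


def thrust_n(torque_nm: float, lead_m: float, efficiency: float) -> float:
    """Axial force from drive torque: F = 2 * pi * eta * T / lead."""
    if lead_m <= 0 or not 0 < efficiency <= 1:
        raise ValueError("lead must be positive and efficiency in (0, 1]")
    return 2.0 * math.pi * efficiency * torque_nm / lead_m


LEAD_M = 0.002  # hypothetical 2 mm lead

speed = linear_speed_m_s(motor_rpm=3000.0, lead_m=LEAD_M)
force = thrust_n(torque_nm=0.5, lead_m=LEAD_M, efficiency=0.8)
print(f"linear speed: {speed:.3f} m/s, thrust: {force:.0f} N")
```

The trade-off is visible in the formulas: a finer lead multiplies thrust but divides speed by the same factor, so lead selection is a per-joint design choice.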
Exhibit: ball screw schematic
Source: Nanjing Chemical Fiber; Dolphin Research
Exhibit: planetary roller screw schematic
(1 = screw; 2 = nut; 3 = rollers) Source: a referenced patent (CSIC 704 Institute); Dolphin Research
Exhibit: planetary roller screw
Source: Bethel Tech; Dolphin Research
Humanoid use is a new application. Historically, planetary roller screws served aerospace, defense and heavy industry. In humanoids, requirements emphasize precision (grade C5 or better), size, power density and dynamics, while being far more cost-sensitive, raising development difficulty.
Some domestic OEMs still favor rotary joints due to control complexity, but linear adoption should rise over time given the advantages. Tesla reportedly uses planetary roller screws in body joints and ball screws in hands; others use ball screws in body joints for now, but routes may change.
What is the value share? At ~RMB 2,400 each and high counts, screws can reach ~20% of BOM (e.g., ~14 planetary roller screws for the body and ~12 miniature ball screws for hands) — the single largest component bucket. This segment features meaningful process/equipment challenges and warrants close attention.
Industry landscape: Japan/Europe lead with THK, NSK, Schaeffler (planetary roller) and HIWIN (Taiwan). In China, leaders advancing include Hengli (planetary roller, early N. America engagements), Xinjian Transmission (working with multiple OEMs), WZXC (front-end processes for local and overseas leaders), Zhejiang Rongtai (acquired KGG, with planetary roller capacity), Bethel Tech, Zhenyu Tech, Shuanglin and Twin-Transmission.
Equipment remains a choke point, as thread grinders are critical and currently imported. Domestic candidates to break through include Qinchuan Machine Tool, Huachen Equipment and Zhejiang Hedeman.
(iii) Gear reducers: large value share; high-end still not fully localized
Rotary joints rely on reducers to convert high-speed/low-torque motor output into low-speed/high-torque with precision. Humanoid types include planetary, harmonic and cycloidal/RV.
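The speed-for-torque trade can be written down directly: an ideal reducer divides speed by the ratio and multiplies torque by the ratio, less efficiency losses. The ratio, efficiency and motor figures in this sketch are illustrative assumptions.

```python
import math


def reducer_output(input_rpm, input_torque_nm, ratio, efficiency):
    """Return (output_rpm, output_torque_nm) for a reducer with the given ratio."""
    if ratio <= 0 or not 0 < efficiency <= 1:
        raise ValueError("ratio must be positive and efficiency in (0, 1]")
    return input_rpm / ratio, input_torque_nm * ratio * efficiency


def shaft_power_w(rpm, torque_nm):
    """Mechanical power: P = torque * angular speed."""
    return torque_nm * rpm * 2.0 * math.pi / 60.0


# Hypothetical joint: 3000 rpm, 0.4 N*m motor through a 50:1 reducer at 90%.
out_rpm, out_torque = reducer_output(3000.0, 0.4, ratio=50.0, efficiency=0.9)
p_in = shaft_power_w(3000.0, 0.4)
p_out = shaft_power_w(out_rpm, out_torque)
print(f"output: {out_rpm:.0f} rpm, {out_torque:.1f} N*m, "
      f"power {p_in:.1f} W -> {p_out:.1f} W")
```

Power is conserved apart from losses, so the reducer does not add energy; it only reshapes motor output into the low-speed, high-torque form a joint needs.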
Exhibit: planetary reducer
Source: Punico (Taiwan); Dolphin Research
Planetary reducers suit heavy loads, high torque and impact resistance, thus common in lower limbs. Tesla reportedly uses planetary reducers in hands for now due to cost, with possible changes later.
Exhibit: harmonic reducer schematic
Source: Robot China; Dolphin Research
Harmonic reducers comprise a wave generator, flexspline and circular spline. The wave generator deforms the flexspline to engage/disengage teeth with the circular spline as it rotates, yielding slow, precise output.
The advantages are very high ratios, high precision and compact size; the drawbacks are limited load and impact tolerance. Harmonic reducers are thus common in upper limbs.
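The "very high ratios" come from tooth counts: with the circular spline fixed and the flexspline as output, the reduction ratio is the flexspline tooth count divided by the tooth difference, and the output turns opposite to the wave generator. The tooth counts below are a common textbook example, not a specific product.

```python
def harmonic_ratio(flexspline_teeth: int, circular_spline_teeth: int) -> float:
    """Reduction ratio with circular spline fixed: Nf / (Nc - Nf)."""
    diff = circular_spline_teeth - flexspline_teeth
    if diff <= 0:
        raise ValueError("circular spline must have more teeth than the flexspline")
    return flexspline_teeth / diff


def flexspline_output_rpm(input_rpm: float, flexspline_teeth: int,
                          circular_spline_teeth: int) -> float:
    """Output speed; the negative sign marks rotation opposite to the input."""
    return -input_rpm / harmonic_ratio(flexspline_teeth, circular_spline_teeth)


# Illustrative tooth counts: a two-tooth difference on a 200-tooth flexspline.
ratio = harmonic_ratio(200, 202)
out_rpm = flexspline_output_rpm(3000.0, 200, 202)
print(f"ratio {ratio:.0f}:1, output {out_rpm:.0f} rpm")
```

A two-tooth difference yields 100:1 in a single stage, which is why harmonic reducers stay compact. The thin, repeatedly flexed flexspline that makes this possible is also the source of their load and impact limits.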
Exhibit: cycloidal reducer schematic
Source: International Trade Group; Dolphin Research
RV reducers use two stages: planetary plus cycloidal. The input drives an eccentric shaft causing the cycloidal disc to revolve and rotate within the pin gear, achieving high ratios.
RV offers high load, precision and ratios but with complex structures and high cost. Cycloidal designs can simplify RV while keeping some benefits. Tesla mainly uses harmonic today, with potential cycloidal adoption in select joints later.
Exhibit: reducer vs. screw usage by OEM/body joint
What is the value share? Harmonic reducers account for ~16% of Optimus BOM, second only to screws. Technical barriers are high; for instance, the harmonic segment is dominated by Japan's Harmonic Drive.
Industry landscape: High-end planetary, harmonic and RV reducers are led by Japanese/German firms, with Harmonic Drive holding over half the global harmonic market. Key hurdles include materials (e.g., special alloy steels for flexsplines), ultra-precision gear machining, high-performance parts such as bearings (with demanding heat-treatment), and design/simulation for tooth forms and mechanics.
China’s progress: Early localization in high-end products is underway. Leaders include Leaderdrive (harmonic, early N. America engagements), Zhongdalide (planetary/harmonic/RV) and Twin-Transmission (RV leader).
Takeaways: the bottleneck is tactile; the opportunities are screws and reducers
Based on value share and entry barriers, the most challenging link is electronic skin, i.e., tactile sensors, a category of demand that humanoids are creating from scratch. The larger value pools with process/equipment difficulty are the two joint components: lead screws for linear joints and reducers for rotary joints.
Dolphin Research summarizes as follows:
Note that humanoids are not yet in mass production, and tech routes have not converged. Our analysis reflects today’s mainstream architectures and could change as new hardware emerges or some parts are replaced.
Also, sub-assemblies are core in their own right. For example, dexterous-hand assembly is harder to industrialize than individual parts such as screws or reducers, and its moat spans manufacturing plus software and HW/SW co-design — to be detailed in our next piece.
