

📃 Whole-Body Loco

humanoid
locomotion
RL
diffusion
motion-tracking
Learning Whole-Body Humanoid Locomotion via Motion Generation and Motion Tracking
Published

June 6, 2026

  • Paper Link (arXiv:2604.17335)
  • Project Page
  1. 🤖 This paper proposes a whole-body humanoid locomotion framework that combines motion generation and motion tracking to address the difficulties of high-dimensional control and terrain adaptation.
  2. ⚙️ The framework pre-trains a diffusion model for terrain-aware motion generation and an RL-based tracker for motion tracking, then fine-tunes the tracker in closed loop with the generator frozen, improving robustness and generalization.
  3. 🚀 Deployed on a Unitree G1 robot, the system successfully traversed a variety of complex terrains, and the experiments show quantitative gains from online motion generation and tracker fine-tuning.

🔍 Ping Review

🔍 Ping: A light tap on the surface. Get the gist in seconds.

This paper proposes a framework that combines motion generation and motion tracking to learn perceptive whole-body locomotion for humanoid robots. Humanoid locomotion is hard because of the high number of degrees of freedom and the robot's morphological instability. Conventional reinforcement learning (RL) often ends up learning motions dominated by the lower body, while imitation-based RL is confined to replaying reference motions and lacks online adaptation to terrain.

The proposed method consists of three main stages:

  1. Data collection and curation:
    • The initial motion data consists of about 5 minutes of motion clips collected from self-recorded motion videos and public datasets [12, 30]. It covers a range of terrain skills, including climbing a 50 cm box, vaulting a 35 cm hurdle, jumping down from a 50 cm box, and going up and down 20 cm stairs.
    • For video sources, human motion is reconstructed with GVHMR [31] and then retargeted to the humanoid with the contact-constrained inverse kinematics (IK) solver in Drake [32].
    • The retargeted trajectories are not used directly. They are refined with a DeepMimic-style tracking policy to produce physically plausible training data.
    • Motion augmentation expands the dataset. Terrain geometry is varied by scaling obstacle heights (e.g. 35-75 cm boxes, 25-45 cm hurdles, 15-20 cm stairs) and by randomly inserting small boxes. The optimized trajectories are again refined by a tracking policy to keep them physically plausible.
  2. Pre-training stage:
    • Whole-body motion tracker: a single whole-body motion tracker is trained with a DeepMimic-style RL framework, using PPO in IsaacLab [33].
      • The reward consists of an imitation reward r_{mimic} and regularization terms r_{reg}: R_{pre} = r_{mimic} + r_{reg}.
      • r_{mimic} includes Base Pose Tracking, Base Velocity Tracking, Joint Position Tracking, Joint Velocity Tracking, and Body Pos Tracking, among others.
      • r_{reg} includes Action Rate (First-Order, Second-Order), Joint Position Limits, Joint Velocity Limits, Torque Limits, Joint Torques, and Body Linear Acceleration, among others.
      • The tracker's observation consists of the reference state (linear/angular velocity, joint positions/velocities, key body positions), proprioceptive information (including the previous action), and terrain height scans.
      • The policy outputs 23-dimensional target joint positions as its action.
    • Diffusion-based motion generator: a diffusion model based on the MDM architecture [17, 21] predicts whole-body reference motion sequences.
      • The model predicts future motion features (root position/orientation, joint positions, body link positions) over a 0.5-second horizon (25 frames).
      • It is conditioned on a target heading vector, terrain height scans, and the motion features of the past two frames.
      • During training, motion sequences are sampled at random from the dataset, the heading vector is computed from the difference in base pose, and the first two frames are used as the condition for predicting the remaining frames.
      • In addition to the reconstruction loss, the training objective includes geometric losses such as velocity, joint consistency, and terrain penetration losses.
      • Extra noise is injected into the height scan and the previous-state condition during training to improve robustness.
  3. RL fine-tuning stage:
    • Combining the pre-trained generator and tracker directly can cause tracking failures due to artifacts and imperfect references in the generated motion, and poor generalization to terrain and target directions outside the training dataset.
    • To address this, the motion generator is kept frozen and the motion tracker is fine-tuned with RL.
    • During fine-tuning, the motion generator produces reference frames conditioned on the past two frames of the robot state, forming a closed-loop motion prediction process.
    • To reduce deployment latency under onboard compute limits, the motion generator uses only 2 denoising steps.
    • To improve robustness to real-world disturbances, extra noise is injected into both the generator's inference process and the tracker's observations.
    • The target heading direction is randomized and a heading tracking reward r_{task} is introduced, encouraging the system to follow the desired direction while still exploiting the pre-trained motion priors. The full reward becomes R_{post} = r_{mimic} + r_{reg} + r_{task}. r_{task} rewards alignment of the base linear velocity v_b with the target direction vector d and is defined as \langle v_b, d \rangle / \lVert v_b \rVert.
    • The training terrain is diversified further to include stairs with 15-25 cm step height, consecutive hurdles 25-55 cm high, multiple climbing boxes 30-85 cm high with varying widths and yaw/pitch angles, and pyramid steps.
    • After fine-tuning, the tracking policy acts as a motion filter: it tracks the references produced by the generator while using exteroceptive terrain observations to suppress unsafe motions.
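
The heading reward above is small enough to write out. The following is a minimal sketch of r_{task} = \langle v_b, d \rangle / \lVert v_b \rVert in plain Python, assuming d is a unit vector; the function name `heading_reward` and the `eps` guard for a stationary base are illustrative additions, not taken from the paper.

```python
import math


def heading_reward(v_b, d, eps=1e-8):
    """r_task = <v_b, d> / ||v_b||, i.e. the cosine between the base
    velocity v_b and the target direction d (d assumed unit length)."""
    dot = sum(v * di for v, di in zip(v_b, d))
    speed = math.sqrt(sum(v * v for v in v_b))
    # eps avoids division by zero when the base is not moving (assumption).
    return dot / max(speed, eps)


aligned = heading_reward([1.0, 0.0], [1.0, 0.0])      # moving along d
sideways = heading_reward([0.0, 2.0], [1.0, 0.0])     # moving perpendicular to d
backwards = heading_reward([-3.0, 0.0], [1.0, 0.0])   # moving against d
```

Because the dot product is normalized by the speed, this form rewards only the direction of travel, not how fast the robot moves along it.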

For hardware deployment, the entire pipeline runs onboard a Unitree G1 robot. DLIO [34] estimates the robot's base pose from LiDAR and IMU data, and this estimate is fed to the motion generator. Elevation Mapping CuPy [35] is used for terrain perception. TensorRT accelerates motion generation, cutting inference time to about 0.02 seconds. The generator runs on a Jetson Thor, and the tracker and other modules run on a Jetson Orin.

The experiments show that the proposed system is effective across varied terrain, including box climbing and descent, stair ascent and descent, and consecutive hurdle vaulting. In particular, it demonstrates versatile climbing behaviors such as changing direction on top of a box, and the ability to switch motion style dynamically across terrain combinations (vaulting, stairs, box climbing). The quantitative analysis shows that online motion generation matters for generalization and that tracker fine-tuning is essential for robustness and higher success rates. Compared with fixed-reference tracking, the system with online motion generation achieved much higher success rates under terrain variation, and fine-tuning substantially improved the tracker's performance, especially on challenging vertical transitions and larger step heights.


🔔 Ring Review

🔔 Ring: An idea that echoes. Grasp the core and its value.

Introduction

Deep RL has been very successful for rough-terrain locomotion on quadrupeds, but the difficulty rises sharply once you move to humanoids. Human-shaped robots have far more degrees of freedom, are morphologically unstable, and cannot clear large obstacles without coordinated whole-body motion. Tasks like humanoid parkour, such as climbing onto a box or vaulting a hurdle, require the hands, feet, and torso to move together.

The question is how to obtain such coordinated motion.

  • RL driven by reward design alone explores inefficiently and tends to converge to monotonous gaits that barely use the hands and upper body. Without explicit structural guidance, traversing complex terrain is difficult.
  • Motion tracking can efficiently transfer coordinated whole-body skills from human motion data, but it essentially replays scripted trajectories, so it lacks adaptability and reactivity in unstructured, varied environments.

The authors ask: "To achieve human-level perceptive locomotion, don't we need a higher-level mechanism that composes and adjusts the appropriate motion style for each obstacle based on real-time perceptual input?"

Previously, the mainstream approach was to train several expert (teacher) policies separately and merge them by distillation, but this depends on an elaborate pipeline of expert assignment and data-distribution design. Generative models (diffusion and others), on the other hand, scale well with large motion datasets, but running generated kinematic trajectories directly on a real robot produces artifacts such as foot sliding and temporal discontinuities that can make it fall.

One-line summary of the paper: use a diffusion motion generator trained on human motion as a "skill composition module", have an RL motion tracker execute its output in a physically plausible way, and fine-tune the tracker in closed loop with the generator to achieve perceptive whole-body locomotion on a real humanoid, without heavy distillation engineering.

flowchart LR
    subgraph S1["1 Data Collection & Curation"]
        V[Motion video<br/>GVHMR reconstruction] --> R[Retarget to robot<br/>contact-constrained IK]
        M[Mocap data] --> R
        R --> AUG["Motion Augmentation<br/>terrain height scaling<br/>+ random box insertion"]
        AUG --> DS[(Motion Dataset<br/>about 1 hour)]
    end
    subgraph S2["2 Pre-training"]
        DS --> GEN["Diffusion<br/>Motion Generator"]
        DS --> TRK["RL Motion Tracker<br/>(DeepMimic + PPO)"]
    end
    subgraph S3["3 RL Fine-tuning"]
        GENf["Generator (frozen)"] --> TRKf["Motion Tracker<br/>(RL retraining)"]
        TRKf -->|robot states<br/>past 2 frames| GENf
        DIR[Direction Command] --> GENf
        HS[Height Scan] --> TRKf
    end
    GEN --> GENf
    TRK --> TRKf

Method

The framework has three stages: (1) collect whole-body motion data for varied terrain and convert it into physically plausible robot trajectories; (2) pre-train a motion tracker and a diffusion motion generator separately on this data; (3) with the generator frozen, fine-tune the tracker with RL on more diverse terrain. At deployment, the generator runs in receding-horizon fashion.

Stage 1: Data collection and curation

The initial data is about 5 minutes of motion clips, gathered from two sources.

  • Self-recorded motion videos: human motion is reconstructed from the raw video with GVHMR.
  • Public mocap datasets.

It contains one representative motion per terrain skill: climbing a 50 cm box, vaulting a 35 cm hurdle, jumping down from a 50 cm box, going up and down stairs of about 20 cm, and omnidirectional walking such as going straight and turning.

The reconstructed and collected motions are retargeted to the humanoid with a contact-constrained IK solver (Drake). Terrain objects are then placed by hand to match the original motion, preserving the robot-environment interaction. An important detail: the raw retargeted trajectories are not used as they are. Instead, a DeepMimic-style tracking policy is trained on them, and the physically plausible trajectories it produces are used as data (removing artifacts).

Motion augmentation. Five minutes is not enough, so kinematics-based augmentation grows the dataset to about 1 hour. New motions are created by changing the terrain geometry of existing ones (scaling obstacle heights, inserting random boxes along the motion path), then optimized with terrain penetration and motion smoothness losses (following PARC) to reduce collisions and discontinuities. Here too, the optimized trajectories are not used directly; a tracking policy is trained again to ensure physical plausibility. The resulting dataset covers climbing and jumping down from 35-75 cm boxes, vaulting 25-45 cm hurdles, 15-20 cm stairs, and omnidirectional walking over randomly placed boxes.
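
To make the two optimization terms concrete, here is a rough sketch that scores a single foot-height trajectory against a terrain height profile. The function names, the 1-D setup, the squared-penalty forms, and the weights are all assumptions for illustration; the PARC-style formulation the paper follows is not reproduced here.

```python
def penetration_loss(body_heights, terrain_heights):
    """Sum of squared depths by which body points sink below the terrain."""
    return sum(max(0.0, t - b) ** 2 for b, t in zip(body_heights, terrain_heights))


def smoothness_loss(trajectory):
    """Sum of squared second differences (a discrete acceleration penalty)."""
    return sum(
        (trajectory[i + 1] - 2.0 * trajectory[i] + trajectory[i - 1]) ** 2
        for i in range(1, len(trajectory) - 1)
    )


def augmentation_objective(body_heights, terrain_heights, w_pen=1.0, w_smooth=0.1):
    """Weighted sum of both terms; the weights are arbitrary placeholders."""
    return (w_pen * penetration_loss(body_heights, terrain_heights)
            + w_smooth * smoothness_loss(body_heights))


# Hypothetical case: the box is scaled from 0.50 m to 0.60 m, but the foot
# path (heights in metres per frame) has not been re-optimized yet.
foot = [0.00, 0.30, 0.55, 0.55, 0.55]
terrain_scaled = [0.00, 0.00, 0.60, 0.60, 0.60]
flat_ground = [0.0] * len(foot)

loss_scaled = augmentation_objective(foot, terrain_scaled)
loss_flat = augmentation_objective(foot, flat_ground)
```

An optimizer would adjust the trajectory to drive the penetration term to zero while keeping the second differences small, which is the trade-off the paragraph above describes.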

Stage 2: Pre-training

(A) Whole-body motion tracker. A DeepMimic-style RL policy is trained with PPO (IsaacLab). The reward consists of an imitation reward r_{\text{mimic}} and regularization terms r_{\text{reg}}.

R_{\text{pre}} = r_{\text{mimic}} + r_{\text{reg}}

  • Observation: reference state (linear/angular velocity in the base frame, joint positions/velocities, key body positions) + proprioception (base angular velocity over 5 frames, projected gravity, joint positions/velocities, previous action) + terrain height scan.
  • Action: 23-dimensional target joint positions.

The imitation reward combines tracking of base pose/velocity, joint positions/velocities, and body positions as exponential rewards, while the regularization terms penalize action rate, joint limits, torques, accelerations, and so on (Table II in the paper). An interesting point: if the demonstrations are accurate enough, terrain information is not strictly necessary, but the authors report that feeding exteroceptive input from this stage onward helps the later fine-tuning.
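
As a sketch of how such a reward could be assembled, the snippet below uses the common exp(-error / sigma) form for tracking terms and negative weighted penalties for regularization. The exact kernels, weights, and scales are in the paper's Table II and are not reproduced here, so every number and name below is a placeholder assumption.

```python
import math


def exp_tracking_reward(reference, actual, sigma):
    """exp(-squared error / sigma): 1.0 for perfect tracking, decaying toward 0."""
    sq_err = sum((r - a) ** 2 for r, a in zip(reference, actual))
    return math.exp(-sq_err / sigma)


def pretrain_reward(tracking_terms, penalties):
    """R_pre = r_mimic + r_reg.

    tracking_terms: list of (weight, reference, actual, sigma) tuples.
    penalties: list of (weight, value) tuples with value >= 0.
    """
    r_mimic = sum(w * exp_tracking_reward(ref, act, s)
                  for w, ref, act, s in tracking_terms)
    r_reg = -sum(w * v for w, v in penalties)
    return r_mimic + r_reg


# Hypothetical numbers: one perfectly tracked joint-position term and one
# action-rate penalty.
r_pre = pretrain_reward(
    tracking_terms=[(1.0, [0.1, 0.2], [0.1, 0.2], 0.25)],
    penalties=[(0.01, 2.0)],
)
```

The exponential form keeps each tracking term bounded in (0, 1], so a single badly tracked quantity cannot dominate the sum the way an unbounded squared error would.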

(B) Diffusion motion generator. Based on the MDM (motion diffusion model) architecture, it predicts future motion sequences.

  • Prediction target: over a 0.5-second horizon (25 frames), root position \mathbb{R}^3, root orientation \mathbb{R}^4, joint positions \mathbb{R}^{23}, and body link positions \mathbb{R}^{23\times3}.
  • Conditioning: target heading vector, terrain height scan, and the motion features of the previous 2 frames.
  • Losses: reconstruction loss + geometric losses (velocity, joint consistency, terrain penetration). To be robust to noisy robot states, the height scan and previous state are perturbed during training.
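
The sizes listed above can be tallied to see how much the generator outputs per call. This is a small arithmetic sketch; treating the \mathbb{R}^4 orientation as a quaternion, giving the two history frames the same feature layout, and counting all 25 frames as predicted are assumptions.

```python
# Per-frame motion feature sizes as listed above.
FEATURE_DIMS = {
    "root_position": 3,
    "root_orientation": 4,        # assumed to be a quaternion
    "joint_positions": 23,
    "body_link_positions": 23 * 3,
}
HORIZON_FRAMES = 25   # 0.5-second prediction horizon
HISTORY_FRAMES = 2    # past frames used as conditioning

per_frame = sum(FEATURE_DIMS.values())
predicted_values = per_frame * HORIZON_FRAMES
history_values = per_frame * HISTORY_FRAMES

print(per_frame, predicted_values, history_values)
```

Under these assumptions each frame has 99 features, so one call produces 2475 values conditioned on 198 history values plus the heading vector and height scan.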

However, simply combining the pre-trained generator and tracker can make the tracker fail, because it was trained only on high-quality references and the generated motion contains artifacts. Stage 3 addresses this.

Stage 3: RL fine-tuning (key)

With the generator frozen, only the tracker is retrained with RL on more diverse terrain and randomized target directions. The generator produces references conditioned on the previous 2 frames of the robot state, forming closed-loop motion prediction. In other words, it does not feed its own predictions back autoregressively; it receives the actual robot state and corrects accordingly.

  • Lower inference latency: only 2 denoising steps are performed.
  • Robustness: extra noise is injected into the generation process and the tracker's observations to prepare for real-world disturbances.
  • Direction following: a heading tracking reward r_{\text{task}} is added.

R_{\text{post}} = r_{\text{mimic}} + r_{\text{reg}} + r_{\text{task}}

The fine-tuning terrain is broader: 15-25 cm stairs, 25-75 cm hurdles, and 30-85 cm climbing boxes and pyramid stairs. After fine-tuning, the tracker acts as a kind of "motion filter": it follows the references produced by the generator but adjusts execution using exteroceptive terrain observations to suppress dangerous motions. Even though cutting denoising to 2 steps lowers generation quality somewhat, the tracker follows robustly, enabling behaviors absent from training (climbing at a box corner, turning on top and then descending, consecutive vaulting).
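
The closed-loop structure can be sketched as a rollout loop. Everything here is a toy: states are scalars, `toy_generate` and `toy_track_step` are hypothetical stand-ins for the diffusion generator and the RL tracker, and the 12-step refresh interval assumes a 50 Hz control rate (0.25 s is 12.5 frames, rounded down).

```python
def run_closed_loop(generate, track_step, state, heading, total_steps,
                    horizon=25, replan_every=12):
    """Roll out a frozen generator and a tracker in closed loop.

    generate(history, heading) -> list of `horizon` reference frames.
    track_step(state, ref_frame) -> next measured robot state.
    The generator is conditioned on the last two measured states,
    not on its own previous predictions.
    """
    history = [state, state]
    reference, cursor = [], 0
    states = []
    for step in range(total_steps):
        if step % replan_every == 0:           # receding-horizon refresh
            reference = generate(history[-2:], heading)
            cursor = 0
        state = track_step(state, reference[cursor])
        cursor = min(cursor + 1, horizon - 1)  # hold last frame if the plan runs out
        history.append(state)
        states.append(state)
    return states


calls = []  # records the state each plan was conditioned on


def toy_generate(history, heading):
    """Stand-in generator: a straight-line reference starting at the measured state."""
    calls.append(history[-1])
    return [history[-1] + heading * 0.02 * (k + 1) for k in range(25)]


def toy_track_step(state, ref_frame):
    """Stand-in tracker: closes half of the gap to the reference each step."""
    return state + 0.5 * (ref_frame - state)


states = run_closed_loop(toy_generate, toy_track_step,
                         state=0.0, heading=1.0, total_steps=50)
```

Each new plan starts from where the robot actually is, so the tracker's lag does not accumulate across plans the way it would if the generator were conditioned on its own earlier output.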

Hardware deployment

The whole pipeline runs fully onboard a Unitree G1 (23-DoF).

| Component | Equipment / method |
| --- | --- |
| Base pose estimation | DLIO + Livox MID360 LiDAR + IMU |
| Terrain perception | Elevation Mapping CuPy (terrain height reconstruction) |
| Attitude correction | The G1 neck joint is passive → head pitch is compensated by fusing the LiDAR IMU with the torso IMU |
| Generator acceleration | TensorRT, inference ≈ 0.02 s |
| Update rate | Each prediction spans 0.5 s (2 Hz), but at deployment the reference is refreshed every 0.25 s in receding-horizon fashion |
| Compute split | Generator → Jetson Thor mounted on the back / tracker and other modules → built-in Jetson Orin |
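
The timing figures in the table imply a simple budget, worked out below. The constants come from the table; treating the 25 frames as evenly spaced over the 0.5 s horizon is an assumption.

```python
HORIZON_S = 0.5        # length of each predicted reference
HORIZON_FRAMES = 25
REPLAN_S = 0.25        # reference refresh period at deployment
INFERENCE_S = 0.02     # approximate TensorRT inference time

frame_dt = HORIZON_S / HORIZON_FRAMES        # seconds per reference frame
replan_rate_hz = 1.0 / REPLAN_S              # how often a new plan is requested
frames_used_per_plan = REPLAN_S / frame_dt   # frames consumed before the next refresh
unused_tail_s = HORIZON_S - REPLAN_S         # part of each plan that is discarded
inference_share = INFERENCE_S / REPLAN_S     # fraction of the refresh period spent on inference

print(frame_dt, replan_rate_hz, frames_used_per_plan, unused_tail_s, inference_share)
```

Under these assumptions the generator is called at 4 Hz, only about half of each 25-frame plan is executed before it is replaced, and inference takes roughly 8% of the refresh period.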

Experiments

Hardware results (qualitative)

The system was evaluated on a real G1 across varied terrain.

  • Versatile box climbing: the robot climbs a 75 cm box and jumps down in three ways: (a) frontal climb and descent, (b) frontal climb, then a 90° right turn and side descent, (c) climb and descent at the box corner. It braces with its knees and hands when climbing and absorbs impact with its hands when descending (consistent with the style of the training data).
  • Stairs and vaulting: it traverses stairs and clears consecutive hurdles of different heights, usually vaulting straight over rather than stepping on top.
  • Local navigation: when following the target direction would make the obstacle hard to clear, the tracking policy partially ignores the reference and detours sideways, avoiding failure and still reaching the goal.
  • Mixed terrain: it performs vaulting → stairs → box climbing in one continuous sequence (switching style with the terrain, e.g. a jump-down followed by stair climbing).

Quantitative results ①: Effect of online motion generation

This experiment compares fixed-reference tracking (Tracker Only) with the full system (Tracker + Gen). In simulation, 500 robots per task are spawned with random initial poses, and terrain height and yaw are varied at test time to measure generalization. Success means reaching the target position.

| Task | Tracker Only | Tracker + Gen |
| --- | --- | --- |
| Box Climbing | 0.859 ± 0.252 | 0.987 ± 0.014 |
| Vaulting | 0.805 ± 0.231 | 0.990 ± 0.026 |
| Ascending Stairs | 0.845 ± 0.300 | 0.997 ± 0.005 |

Interpretation: the fixed-reference tracker does well near nominal conditions but becomes fragile quickly once obstacle height or direction changes, especially under large changes. The large standard deviations (±0.23 to ±0.30) despite high means show this. The full system, by contrast, stays at around 0.99 in every setting. Having the online generator change the motion itself (timing, style) to fit the terrain is what drives adaptation. The fixed tracker does see the height scan, but at best it achieves limited local generalization that absorbs small mismatches.
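
For readers who want the gaps spelled out, the snippet below recomputes them from the table. The numbers are copied from the table above; the dictionary layout and variable names are only for illustration.

```python
# (mean, std) success rates copied from the table above.
results = {
    "Box Climbing":     {"tracker_only": (0.859, 0.252), "tracker_gen": (0.987, 0.014)},
    "Vaulting":         {"tracker_only": (0.805, 0.231), "tracker_gen": (0.990, 0.026)},
    "Ascending Stairs": {"tracker_only": (0.845, 0.300), "tracker_gen": (0.997, 0.005)},
}

gains = {}       # absolute increase in mean success rate
std_ratio = {}   # how many times smaller the standard deviation becomes
for task, r in results.items():
    (m0, s0), (m1, s1) = r["tracker_only"], r["tracker_gen"]
    gains[task] = round(m1 - m0, 3)
    std_ratio[task] = round(s0 / s1, 1)
    print(f"{task}: mean +{gains[task]:.3f}, std {std_ratio[task]:.1f}x smaller")
```

The mean gains are between roughly 0.13 and 0.19, while the standard deviation shrinks by about an order of magnitude or more, which is the stronger signal of robustness.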

Quantitative results ②: Effect of tracker fine-tuning

With the generator held the same, the pre-trained tracker is compared with the fine-tuned tracker on 5 tasks: climbing up/down, vaulting, and stairs up/down. Each task uses 500 robots, with the target placed behind the obstacle, and the success rate is measured.

Results (Fig. 4): fine-tuning raises the success rate consistently on every task, and the gain is larger on harder terrain. For example, the gap widens as box height increases, for both climbing and descent. The authors explain that the pre-trained tracker was trained on smooth, refined, fixed offline trajectories, whereas at deployment the generator outputs references conditioned on noisy robot states, targets, and terrain, with discontinuities and artifacts, which creates a distribution mismatch. Fine-tuning makes the tracker adapt to this generated distribution, use the motion patterns effectively to complete the task, and suppress dangerous motions.

Critical discussion

Strengths

  • A clean combination of two paradigms. The division of labor is clear: "the generative model handles coordination, the RL tracker handles physical plausibility". It packs diverse skills into one system while avoiding distillation's burden of careful expert assignment and data design.
  • Onboard deployment on a real humanoid. The work does not stop at simulation: running LiDAR, elevation mapping, TensorRT, and dual Jetsons on the G1 to traverse mixed terrain fully onboard is impressive. The engineering that makes closed-loop generation real-time with 2 denoising steps plus receding-horizon updates is a key contribution.
  • Independent validation of the two design elements. Online generation (results ①) and tracker fine-tuning (results ②) are each tested in a controlled comparison, showing quantitatively and separately that both are needed. Exposing robustness through the standard deviation is especially convincing.
  • Starts from little data. Just 5 minutes of motion are expanded to 1 hour through kinematic augmentation to cover varied terrain, which is a practical recipe for data efficiency.

Weaknesses and limitations

  • Dependence on exteroceptive perception (acknowledged by the authors). The whole system relies heavily on LiDAR-based elevation mapping. If sensing noise degrades mapping quality, locomotion performance can deteriorate badly. The authors propose neural mapping or a belief encoder as future directions.
  • Hardware evaluation is mostly qualitative. The real-robot results center on demo videos and snapshots, and the quantitative (success-rate) comparisons were done mostly in simulation. Quantitative analysis of the sim-to-real gap is limited. (The videos show what appears to be a safety tether, so real-world reliability figures such as failure rate and fall frequency seem to be needed; this is speculation.)
  • Generation latency vs. quality trade-off. Cutting to 2 denoising steps lowers generation quality. The tracker is said to compensate, but the quantitative impact of the quality drop (e.g. a success-rate curve over the number of steps) is not adequately reported.
  • Single embodiment. The work focuses on the Unitree G1 alone, so generalization to other humanoids is unverified.
  • Limited to pure locomotion. Loco-manipulation, where the hands handle objects, and unstructured outdoor environments are not covered (the authors also list these as future work).

Summary and conclusion

This paper tackles a long-standing dilemma in humanoid whole-body locomotion, the poor coordination of reward-engineered RL versus the poor adaptability of motion tracking, by combining "generation + tracking". A diffusion motion generator trained on human motion produces references that fit the terrain and direction in real time, an RL motion tracker executes them in a physically plausible way, and the tracker is fine-tuned in closed loop with the generator frozen so that it stays robust to imperfect generation.

In key numbers, the full system raised success rates over fixed-reference tracking from 0.86 → 0.99 on box climbing, 0.81 → 0.99 on vaulting, and 0.85 → 1.00 on stair ascent (with much smaller standard deviations), and fine-tuning brought larger gains on harder terrain. All of this runs fully onboard a Unitree G1, which actually traversed boxes, hurdles, stairs, and mixed terrain.

From a practical standpoint, the value of this work lies in offering a blueprint: "even without heavy distillation, perceptive humanoid locomotion is achievable by using a generative model as a skill composer and an RL tracker to ensure physical plausibility". The limitations are clear (reliance on LiDAR perception, mostly qualitative real-robot evaluation, a single embodiment), but the recipe of separating generation from tracking and fine-tuning in closed loop points to a promising direction for humanoid whole-body control.

Copyright 2026, JungYeon Lee