Curieux.JY
  • JungYeon Lee

📃 Beyond Binary: Sim-to-Real Dexterous Manipulation with Physics-Grounded Contact Representation

Tags: twist, dexterity, teleop

Published: May 24, 2026

  • Paper
  • Project Page
  1. 🤖 This work proposes Center-of-Pressure (CoP), a new physics-grounded tactile representation, to close the gap between simulation and reality in contact-based dexterous robot manipulation.
  2. 🛠️ To make CoP usable in practice, the paper presents a sensor calibration method based on differentiable dynamics that estimates taxel orientations without ground-truth force measurements, which enables alignment between simulation and hardware.
  3. 🚀 Policies trained on CoP achieve zero-shot sim-to-real transfer on difficult tasks such as peg-in-hole and ball balancing, outperform existing baselines, and show an emergent ability to implicitly learn task-relevant physical properties such as object mass.

🔍 Ping Review

🔍 Ping: A light tap on the surface. Get the gist in seconds.

This paper presents a new approach to narrowing the sim-to-real gap in contact-rich robot manipulation, with a particular focus on making effective use of tactile sensor data.

Problem and contributions:

Sim-to-real reinforcement learning (RL) has struggled to exploit information-dense modalities such as touch, both because real-world data is hard to collect and because of the simulation-reality gap. Given the complexity of tactile data, most prior work used it only in simplified form, which sacrifices the rich information that complex manipulation needs. To address this, the paper introduces Center-of-Pressure (CoP), a physics-grounded tactile representation. CoP retains dense contact information while remaining robust under sim-to-real transfer. To support CoP, the authors propose a sensor calibration scheme based on differentiable dynamics that estimates the orientation of each taxel (an individual tactile sensing point) without ground-truth force measurements.

Core method: the Center-of-Pressure (CoP) representation and the taxel-CoP mapping

  1. Definition of CoP: CoP summarizes the tactile information at a robot fingertip. It consists of a 3D force vector ^{S}f_{cop} \in \mathbb{R}^3 (the total contact force) and a 3D Cartesian contact position ^{S}p_{cop} \in \mathbb{R}^3, both defined in the sensor frame S. It is an approximation that reduces the resultant contact wrench to a single force vector and its central contact point.

  2. Taxel-CoP mapping: The paper proposes a way to map raw taxel readings (^{T_i}f_i \in \mathbb{R}^3) from tactile sensors such as the XELA uSkin to the CoP representation (^{S}f_{cop}, ^{S}p_{cop}). Each taxel i has its own position ^{S}p_i and orientation R_i \in \text{SO}(3) relative to the sensor frame S.

    • Stress distribution model: Instead of a plain summation, the model accounts for force spreading through the compliant silicone layer.
      • The CoP force vector f_{cop} is decomposed into a normal component f_n and a shear component f_s.
      • An effective normal force f_{i,n} and shear force f_{i,s} are modeled for each taxel i. This captures both the change in force direction caused by deformation and the decay of force magnitude with distance from the contact point.
      • p_{cop} is estimated as a weighted average of the positions p_i of the active taxels: p_{cop} = \sum_{i \in \mathcal{A}} \frac{\left\lVert f_i \right\rVert}{\sum_{j \in \mathcal{A}} \left\lVert f_j \right\rVert} p_i.
      • The effective normal-force direction is approximated by blending the taxel's local normal \hat{n}_i with the relative direction \hat{v}_i from the CoP to the taxel: \hat{b}_i = \text{normalize}(w_i \hat{n}_i + (1 - w_i)\hat{v}_i), where w_i = \exp(-\left\lVert p_i - p_{cop} \right\rVert^2 / (2\sigma^2)) is a Gaussian radial weight.
      • The shear force is approximated by projection onto the surface tangent plane, P_{shear} = I_3 - \hat{n}_{cop} \hat{n}_{cop}^T.
      • The relation between a taxel force f_i and f_{cop} is then written compactly as f_i = M_i f_{cop}, where M_i = w_i(\hat{b}_i \hat{n}_{cop}^T + P_{shear}).
    • Computing the CoP force: To recover the unknown f_{cop} from the observed taxel forces \{f_i\}, the per-taxel equations are stacked into a global linear system A f_{cop} = b, and f_{cop} is obtained as the closed-form solution of a regularized least-squares problem: f_{cop} = (A^T A + \lambda^2 I)^{-1} A^T b, where A = [M_1^T, \dots, M_N^T]^T and b = [f_1^T, \dots, f_N^T]^T. The model is computationally cheap and differentiable, so it can be used in gradient-based learning.
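The force-weighted contact position described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `estimate_cop_position`, the threshold value `eps`, and the toy taxel layout are all assumptions.

```python
import numpy as np

def estimate_cop_position(taxel_pos, taxel_force, eps=0.05):
    """Force-magnitude-weighted mean of the active taxel positions.

    taxel_pos:   (N, 3) taxel positions in the sensor frame S
    taxel_force: (N, 3) taxel forces already rotated into S
    eps:         activation threshold on force magnitude (assumed value)
    Returns p_cop as a (3,) array, or None if no taxel is active.
    """
    mags = np.linalg.norm(taxel_force, axis=1)
    active = mags > eps                      # active set A
    if not np.any(active):
        return None                          # no contact detected
    weights = mags[active] / mags[active].sum()
    return weights @ taxel_pos[active]

# Toy example: three taxels on a line; the third is below the threshold.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
force = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.01]])
p_cop = estimate_cop_position(pos, force)
print(p_cop)  # pulled toward the taxel that carries more force
```

With these toy numbers the estimate lands three quarters of the way from the first taxel to the second, because the second taxel carries three times the force and the third is filtered out as noise.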

Sensor calibration via differentiable dynamics:

Because the orientation R_i of each taxel frame is hard to calibrate by hand, the paper proposes a way to estimate it automatically.

  1. Data collection: Consider the kinematic chain from the hand's base frame B to the sensor frame S. While the joint positions are held fixed, arbitrary contacts are applied to the fingertip, so the joint actuators apply torques to balance the external force. The raw taxel forces ^{T_i}f_i, the applied joint torques \tau \in \mathbb{R}^4, and the joint angles q \in \mathbb{R}^4 are recorded. Varying the contact location and direction captures both the normal and the shear response of the taxels.
  2. Rotation parameterization: The learnable taxel rotation R_i is parameterized with the \mathbb{R}^9 + SVD method. An arbitrary 3x3 matrix P \in \mathbb{R}^{3 \times 3} is projected onto a valid rotation matrix R \in \text{SO}(3) using the singular value decomposition (SVD): R = \text{SVD}^+(P) = U \text{diag}(1, 1, \text{det}(UV^T))V^T, where P = U \Sigma V^T.
  3. Optimization: During training, the recorded taxel forces ^{T_i}f_i of each sample are rotated into the sensor frame S using the current estimate \hat{R}_i. The taxel-to-CoP mapping then yields an estimated CoP force vector ^{S}\hat{f}_{cop} and contact position ^{S}\hat{p}_{cop}, which are transformed into the base frame B using the recorded joint angles q. Under static equilibrium, the joint torques \tau are related to the external force f through the position Jacobian J at the contact point: \tau = -J^T f. This gives the expected joint torques for the estimated CoP force, \hat{\tau} = -{}^{B}\hat{J}_{cop}^T \, {}^{B}\hat{f}_{cop}. Finally, the MSE loss between the estimated \hat{\tau} and the recorded \tau, \mathcal{L} = \left\lVert \tau - \hat{\tau} \right\rVert_2^2, is computed and its gradient is backpropagated to refine the rotation parameters \hat{P}.

Sim-real alignment:

  • Contact representation: Contacts between the fingertip and the contacted object are tracked with IsaacLab's ContactSensor API. Because shear components are unstable in simulation, only the surface-normal component of CoP is used, both in simulation and on hardware. To minimize the sim-to-real gap, the taxel-CoP mapping parameters are calibrated once using paired rollout data collected in simulation and on the real system.
  • Actuator dynamics: Subtle actuator dynamics (for example, non-uniform friction) are hard to model, so a system identification approach based on Bayesian optimization aligns the simulated actuator dynamics with the real hardware. The robot is driven with a variety of sequences (step inputs, slow ramp inputs, chirp inputs) to collect response data, and the simulation model is optimized to match it.
  • Sensor delay: The tactile sensor has a non-negligible delay, so the same delay is introduced during training in simulation.

Experiments:

Two challenging "blind", contact-centric manipulation tasks are evaluated on a 16-DOF Allegro Hand equipped with XELA uSkin sensors:

  1. Peg-in-hole insertion: Using peg and hole assets of various shapes (circle, diamond, ellipse, hexagon, square, triangle), the hand grasps the peg and inserts it fully into the hole. The initial yaw and position are randomized.
  2. Ball balancing: A lightweight square plate is supported by four fingers, and a ball placed on the plate must be kept balanced and brought to the center. Training uses a smooth sphere, while evaluation uses four kinds of balls that differ in mass, size, friction, and surface texture.

Baselines:

  • base: proprioception only (current and commanded joint angles).
  • bin (binary): one binary contact signal per sensing array.
  • mag: CoP force magnitude only.
  • vec: CoP force vector only.
  • pos: CoP contact position only.
  • taxel: raw taxel forces.
  • cop (ours): the proposed CoP representation.
  • human: performance of an expert human.

Policy architecture:

Temporal context is provided by a recurrent policy architecture (GRU) rather than by explicit history stacking, which gave better sample efficiency and performance. Policies are trained in IsaacLab with asymmetric actor-critic PPO.

Results and analysis:

  1. Peg-in-hole insertion:
    • cop achieved the highest overall success rate and outperformed every baseline on most insertion shapes.
    • High-fidelity contact representations such as vec and cop led to more adaptive and persistent policies and therefore higher success rates, but they took longer to complete the task than the simplified representations (base, bin).
    • taxel performed worse than most other baselines, possibly because of imperfect tactile simulation, high dimensionality, and sensor-specific discrepancies.
    • Robustness to out-of-distribution (OOD) initialization: The cop policy showed the smallest drop in success rate under OOD peg pose initialization, and it showed an emergent ability to reach alignment through persistent in-hand object translation and re-orientation.
    • Robustness to masked sensors: When 40% of the raw hardware taxel forces were randomly masked, the high-fidelity contact representations generally degraded more than the simplified ones.
  2. Ball balancing:
    • Accurate force information matters for this task: only the cop, vec, and taxel policies learned it successfully in simulation.
    • The similar real-world performance of the cop and vec policies suggests that force information alone may be sufficient for this task.
    • The policies showed two distinct emergent motion patterns: an accelerate-decelerate maneuver and a slow centering process.
    • Object state prediction: An analysis of the latent output of the policy network's recurrent layer showed that CoP information was used to track the ball's position effectively, while velocity prediction was comparatively weak. This suggests that the policy uses contact information to track ball position but may not encode the motion dynamics precisely.
    • Implicit mass identification: The authors analyzed whether the latent representation of the trained policy implicitly captures dynamic properties such as the ball's mass. Distinct clusters corresponding to different mass values appeared in the latent embedding, which suggests that the CoP-based policy state is organized around task-relevant physical properties such as object mass.

Conclusion:

The paper proposes CoP as a physics-grounded tactile representation that narrows the tactile sim-to-real gap by aligning real taxel readings with the contact quantities available in simulation. It demonstrates the effect through a systematic evaluation on dynamic "blind" manipulation tasks with no visual input, and it offers insight into the emergent latent representations of the learned policies. The results suggest that physics-grounded intermediate tactile representations are a promising route to scalable sim-to-real learning for contact-rich dexterous manipulation.

Limitations:

  • Fidelity vs. transferability: By abstracting raw taxel readings into force and position, CoP improves transferability between simulation and reality, but some sensor-specific detail may be lost.
  • Sim-real contact mismatch: The current implementation restricts the CoP force vector to the surface-normal direction because simulated shear-force estimates were unreliable. In addition, the simulator reports only contacts with the task object, whereas the real sensor responds to every contact, including self-collisions and interactions with the environment.
  • Scope and future directions: The work focuses on a fixed dexterous hand with XELA uSkin sensors. Extending CoP to arm-hand systems, full-hand tactile coverage, and other tactile sensor types is left to future work.

🔔 Ring Review

🔔 Ring: An idea that echoes. Grasp the core and its value.

In one line: tactile sim-to-real has long been stuck choosing between "coarse but safe" and "rich but unstable" representations, and this paper resolves that choice with an intermediate representation reduced to physical quantities (Center-of-Pressure, CoP). As a result, sim-to-real works directly, without teacher-student distillation, and it beats both binary contact and raw taxel representations.

Paper: Beyond Binary: Sim-to-Real Dexterous Manipulation with Physics-Grounded Contact Representation (Pan, Coros, Malik, Lin / ETH Zurich, UC Berkeley / arXiv:2605.28812v1, 2026)

At a glance

  • Problem: Fingertip touch is decisive for contact-rich manipulation, but it is hard to reproduce accurately in simulation. Most sim-to-real work therefore either flattens touch into a binary signal (contact / no contact) or uses the raw signal as is and, because sim and real do not match, relies on a separate distillation stage.
  • Proposal: Summarize contact as a 6-dimensional physical quantity (CoP): a 3D force vector plus a 3D contact position. This has the same form as the contact information that rigid-body simulators such as IsaacSim and MuJoCo already report, so sim and real can be placed in the same coordinate frame.
  • The calibration trick: Taxel orientations are estimated from joint torques at static equilibrium alone, with no expensive force sensor (ground-truth force). The problem is solved by backpropagating through differentiable dynamics.
  • Validation: Zero-shot sim-to-real succeeds on two "blind" tasks that make almost no use of vision (peg-in-hole insertion and ball balancing). CoP beats both binary and raw taxel.
  • An interesting by-product: Inspecting the latent state of a policy trained on CoP shows physical properties such as the ball's mass emerging as clusters, even though nothing taught the policy about them.

Introduction: why touch was the neglected stepchild of sim-to-real

Dexterous manipulation with a hand is, in the end, a game of contact. Whether inserting a peg into a hole or twisting a bottle cap, success depends on reading the small forces and slips at the fingertips and reacting to them. The problem is data. Collecting this kind of contact data on a real robot is too expensive: a person has to accumulate demonstrations one by one through teleoperation, and the amount needed explodes once robustness is also required.

Sim-to-real reinforcement learning is the most attractive way around this cost. Thousands of hands run in parallel in simulation while the policy learns autonomously, and the result is then moved to real hardware. So far, though, the successes of this approach have been limited to relatively simple tasks, namely those that can be solved with low-dimensional inputs that match well between sim and real (joint angles, object pose, and so on).

Touch has always been either left out or, when included, heavily simplified. The reason is clear. A tactile sensor's response depends on high-dimensional, unmodeled physical processes such as silicone deformation, friction, and contact geometry, and sensor designs vary so much that there is no standard. Reproducing this faithfully in a simulator is close to impossible.

The key point the authors make here is a trade-off in representation. At one end are simple representations such as binary or ternary contact signals. Sim and real look almost identical, so transfer is stable. In exchange, the policy only knows roughly "which finger touched" and discards how hard and exactly where the contact was. At the other end are information-rich representations such as raw taxel signals. They carry all the detail complex manipulation needs, but sim and real look different, so they do not transfer as is.

flowchart LR
  A["Binary / Ternary contact<br/>(coarse, low-dim)"] -->|"robust transfer<br/>but blind"| B
  C["Raw taxel signals<br/>(rich, high-dim)"] -->|"detailed<br/>but brittle"| B
  B["CoP<br/>3D force + 3D position<br/>(physics-grounded middle)"]
  style B fill:#2d6cdf,stroke:#1a3d7a,color:#fff

CoP aims squarely at the middle ground between the two. It summarizes contact as a 3D force vector (which direction, how much) and a 3D contact position (where on the fingertip). It is compact enough that sim and real can be matched, and expressive enough to preserve the core information of force and position. What makes the representation special is that it is not an arbitrarily chosen feature: it has the same form as the physical quantities rigid-body simulators already compute and report. IsaacSim and MuJoCo provide the contact force vector and contact position between object pairs by default, and CoP builds directly on that.

Intuition: Suppose dozens of pressure sensors are embedded in a fingertip. Passing every one of those pixel-like signals to the policy is like throwing in camera RAW data unprocessed. There is plenty of information, but if the noise patterns differ between sim and real, the policy learns spurious features. CoP compresses those pixels into a single arrow: "where, on average, and in which direction did the force act?" It is the same idea a physicist uses when reducing a distributed pressure field to one resultant force and its point of application.

The "Beyond Binary" in the paper's title points at exactly this. It is a call to move past the one-bit contact/no-contact signal without sinking into the swamp of raw signals, and to adopt an intermediate representation rooted in physics.

Another contribution of the introduction is the design of the evaluation tasks. Earlier evaluations of tactile manipulation had two pitfalls: the policy actually relied heavily on vision, or the task was simple repetition (a periodic motion such as in-hand rotation), which made it hard to isolate the role of touch. The authors design two new "blind" tasks that minimize visual cues so that the contribution of touch shows up cleanly.


Method: how CoP is built and how it is calibrated

The method has three main parts: (1) the definition of the CoP representation itself, (2) a differentiable two-way mapping between raw taxels and CoP, and (3) a calibration procedure that estimates taxel orientations without a force sensor. Sim-real alignment details come on top of these.

3.1 CoP as a physical representation

On a round fingertip, CoP is defined by two quantities.

  • Contact force vector ^{S}f_{cop} \in \mathbb{R}^3: the total contact force acting on that link
  • Contact position ^{S}p_{cop} \in \mathbb{R}^3: the representative point at which that force acts

Both are expressed in the sensor frame S. The key point is that the contact information a simulator reports (a force vector and a position in the world frame) can be brought into exactly the same form with nothing more than a coordinate transform. Sim and real therefore describe contact in the "same language".
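The coordinate transform mentioned above is a standard rigid-body change of frame. The sketch below shows it under assumed conventions: `R_ws` and `t_ws` (the pose of the sensor frame S in the world frame) and the function name `world_to_sensor` are hypothetical and are not taken from the paper or from any simulator API.

```python
import numpy as np

def world_to_sensor(f_world, p_world, R_ws, t_ws):
    """Express a contact force and position in the sensor frame S.

    R_ws: (3, 3) rotation of frame S expressed in the world frame
    t_ws: (3,) origin of frame S expressed in the world frame
    """
    f_s = R_ws.T @ f_world            # forces are free vectors: rotate only
    p_s = R_ws.T @ (p_world - t_ws)   # points: shift to the S origin, then rotate
    return f_s, p_s

# Sensor frame rotated 90 degrees about world z and shifted along world x.
R_ws = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])
t_ws = np.array([1.0, 0.0, 0.0])
f_s, p_s = world_to_sensor(np.array([0.0, 2.0, 0.0]),
                           np.array([1.0, 0.5, 0.0]), R_ws, t_ws)
print(f_s, p_s)
```

Here a world-frame force along +y becomes a force along the sensor's +x axis, since that is where the sensor's x axis points in this toy pose.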

In a footnote the authors are candid about one point: CoP does not represent arbitrary multi-contact pressure distributions perfectly. It is an approximation that reduces the resultant wrench to a single force vector and a central contact point. In other words, it is a deliberately simplified local contact descriptor.

3.2 Taxel ↔ CoP mapping: silicone spreads the force

This is the most physically satisfying part of the paper. A sensor such as the XELA uSkin consists of N taxels arranged in a grid. Each taxel is a point sensor that measures a 3-axis force. The position ^{S}p_i and orientation R_i \in SO(3) of the i-th taxel are given relative to the sensor frame.

The naive approach is to sum or average the taxel forces. That is wrong, because the compliant silicone layer covering the sensor spreads the force sideways. Pressing a single point makes the neighboring taxels respond as well, so a plain sum gives biased estimates of both the resultant force and the contact position.

The authors' stress distribution model describes this spreading through two physical effects.

  1. Change of direction: because of deformation, the force direction bends more the farther a taxel is from the contact point.
  2. Decay of magnitude: the force magnitude decreases with distance from the contact point.

The CoP force f_{cop} is split into a surface-normal component f_n and a shear component f_s, and the effective normal force f_{i,n} and effective shear force f_{i,s} acting on each taxel i are then modeled.

Key ideas of the forward mapping (CoP → taxel):

  • The contact position p_{cop} is a weighted average of the taxel positions, weighted by the measured force magnitude \|f_i\|. To suppress noise, only the active set \mathcal{A} of taxels above a threshold \epsilon is used.
  • To account for the curved geometry, the normal \hat{n}_{cop} at the CoP location is approximated by inverse distance weighting of the taxel normals.
  • The stress direction under deformation is approximated by a "blended" unit vector \hat{b}_i, which mixes the taxel's own normal \hat{n}_i with the direction \hat{v}_i from the CoP to the taxel using a Gaussian weight w_i.

\hat{b}_i = \text{normalize}(w_i \hat{n}_i + (1-w_i)\hat{v}_i), \qquad w_i = \exp\!\left(-\frac{\|p_i - p_{cop}\|^2}{2\sigma^2}\right)

Here \sigma is a hyperparameter that controls the amount of spreading. Making w_i Gaussian turns an intuition into a formula: taxels near the contact point follow the surface normal, while distant taxels follow the radial spreading direction more.
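To see how the weight interpolates between the two directions, here is a small numerical sketch of w_i and \hat{b}_i. It assumes a flat toy surface with all normals along z; the function name `blended_direction` and the value of `sigma` are illustrative choices, not values from the paper.

```python
import numpy as np

def blended_direction(p_i, n_i, p_cop, sigma):
    """Return (w_i, b_hat_i) for one taxel.

    p_i, n_i: taxel position and unit normal in the sensor frame
    p_cop:    estimated contact position
    sigma:    spread hyperparameter
    """
    d = p_i - p_cop
    dist = np.linalg.norm(d)
    w = np.exp(-dist**2 / (2.0 * sigma**2))       # Gaussian radial weight
    # Direction from the CoP to the taxel; undefined at zero distance,
    # where w = 1 and the blend reduces to the taxel normal anyway.
    v = d / dist if dist > 1e-12 else np.zeros(3)
    b = w * n_i + (1.0 - w) * v
    return w, b / np.linalg.norm(b)

n = np.array([0.0, 0.0, 1.0])
p_cop = np.zeros(3)
w_near, b_near = blended_direction(np.array([0.1, 0.0, 0.0]), n, p_cop, sigma=1.0)
w_far, b_far = blended_direction(np.array([3.0, 0.0, 0.0]), n, p_cop, sigma=1.0)
print(w_near, b_near)   # weight close to 1, direction close to the normal
print(w_far, b_far)     # weight close to 0, direction close to radial
```

The near taxel keeps a direction almost identical to its normal, while the far taxel's direction is dominated by the radial vector pointing away from the CoP.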

The shear force is handled with a projection matrix onto the surface tangent plane.

P_{shear} = I_3 - \hat{n}_{cop}\hat{n}_{cop}^\top \in \mathbb{R}^{3\times 3}

Putting all of this together, the relation between each taxel force and the CoP force collapses into a single linear map.

f_i = M_i f_{cop}, \qquad M_i = w_i(\hat{b}_i \hat{n}_{cop}^\top + P_{shear}) \in \mathbb{R}^{3\times 3}

Why this is clever: M_i is a low-parameter model that captures only the dominant effect, distance-dependent spreading, and it is differentiable. That differentiability is what makes the calibration described later possible.
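A quick way to sanity-check the form of M_i is to build it numerically. The sketch below is illustrative only: `taxel_map` is a hypothetical helper, and the weight and blended direction of the distant taxel are made-up numbers rather than calibrated values.

```python
import numpy as np

def taxel_map(w_i, b_i, n_cop):
    """M_i = w_i (b_i n_cop^T + P_shear), with P_shear = I - n_cop n_cop^T."""
    n_cop = n_cop / np.linalg.norm(n_cop)
    p_shear = np.eye(3) - np.outer(n_cop, n_cop)   # projector onto the tangent plane
    return w_i * (np.outer(b_i, n_cop) + p_shear)

n_cop = np.array([0.0, 0.0, 1.0])
f_cop = np.array([0.3, 0.0, 2.0])                  # shear along x, normal along z

# A taxel exactly at the CoP: w = 1 and b = n_cop, so M is the identity.
M_center = taxel_map(1.0, n_cop, n_cop)

# A distant taxel: small weight, blended direction tilted toward +x.
b_far = np.array([0.6, 0.0, 0.8])
M_far = taxel_map(0.2, b_far, n_cop)
f_far = M_far @ f_cop
print(M_center)
print(f_far)
```

The first case shows that a taxel sitting right at the contact point sees the CoP force unchanged. The second shows the two modeled effects at once: the force is scaled down by w_i, and part of the normal load reappears along the tilted direction \hat{b}_i.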

The inverse mapping (taxel → CoP) is the problem of solving for the unknown f_{cop} given the observed forces \{f_i\} of the active taxels. The per-taxel equations are collected into a global linear system A f_{cop} = b and solved with the closed-form solution of regularized least squares.

f_{cop} = (A^\top A + \lambda^2 I)^{-1} A^\top b, \qquad A = [M_1^\top, \dots, M_N^\top]^\top, \quad b = [f_1^\top, \dots, f_N^\top]^\top

Because the solution is closed-form, it is cheap to compute, and because both directions (forward and inverse) run through the same differentiable model, it is practical to move back and forth between CoP in simulation and taxels on hardware. This is the physical basis of the "alignment".
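The closed-form inverse map is an ordinary ridge-regularized least-squares solve. The following sketch checks it with a synthetic round trip; the per-taxel matrices are arbitrary placeholders (not calibrated M_i), and `solve_f_cop` and the value of `lam` are assumptions made for illustration.

```python
import numpy as np

def solve_f_cop(M_list, f_list, lam=1e-3):
    """Closed-form ridge solution f_cop = (A^T A + lam^2 I)^-1 A^T b."""
    A = np.vstack(M_list)                 # (3N, 3): stacked taxel maps
    b = np.concatenate(f_list)            # (3N,):  stacked taxel forces
    lhs = A.T @ A + (lam ** 2) * np.eye(3)
    return np.linalg.solve(lhs, A.T @ b)  # solve rather than form the inverse

# Round trip on synthetic data: generate taxel forces from a known f_cop,
# then recover it from those forces.
rng = np.random.default_rng(0)
f_true = np.array([0.2, -0.1, 1.5])
M_list = [w * np.eye(3) + 0.05 * rng.standard_normal((3, 3))
          for w in (1.0, 0.7, 0.4, 0.2)]
f_list = [M @ f_true for M in M_list]
f_est = solve_f_cop(M_list, f_list)
print(f_est)
```

With a small \lambda the known force is recovered almost exactly. A larger \lambda shrinks the estimate toward zero, which is the usual trade-off of this kind of regularization.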

flowchart LR
  subgraph SIM["Simulation (IsaacLab)"]
    S1["Pairwise contact<br/>force + position"] --> S2["CoP (sim)"]
  end
  subgraph REAL["Hardware (XELA uSkin)"]
    R1["Raw taxel forces<br/>f_i, 3-axis"] --> R2["Inverse map<br/>least-squares"] --> R3["CoP (real)"]
  end
  S2 -.->|"aligned, same form"| R3
  R3 --> POL["Policy observation"]
  S2 --> POL
  style S2 fill:#2d6cdf,stroke:#1a3d7a,color:#fff
  style R3 fill:#2d6cdf,stroke:#1a3d7a,color:#fff

3.3 Calibration: estimating taxel orientation without a force sensor

The mapping has one unknown. Taxel positions ^{S}p_i come straight from the sensor datasheet, but the direction each taxel faces on the curved surface (R_i) is hard to calibrate by hand because the fingertip geometry is complex.

Earlier work usually calibrated against ground-truth forces measured with a high-precision force sensor. The elegant part of this paper is that it does without such extra equipment and solves the problem using only the robot's own joint torques. The idea comes from statics.

When an external force acts on a finger, the joint actuators produce torque to resist it and maintain static equilibrium. At static equilibrium, the joint torque \tau and the external force f are tied together by the position Jacobian J (the gravity compensation term is neglected).

\tau = -J^\top f + g(q) \approx -J^\top f

The estimated CoP force can then be used to work backward to the joint torque \hat{\tau} that "would have been required".

\hat{\tau} = -\,{}^{B}\hat{J}_{cop}^\top \, {}^{B}\hat{f}_{cop}
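The static relation can be checked on a toy mechanism. The example below uses a planar two-link finger rather than the 4-joint fingers of the actual hand; the link lengths and the helper names `planar_jacobian` and `holding_torque` are hypothetical.

```python
import numpy as np

def planar_jacobian(q, l1=0.05, l2=0.04):
    """Position Jacobian of the tip of a planar two-link finger (2 x 2)."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def holding_torque(q, f_ext):
    """Joint torques that balance an external tip force: tau = -J^T f."""
    return -planar_jacobian(q).T @ f_ext

q = np.array([0.0, np.pi / 2])        # first link along x, second along y
f_ext = np.array([-1.0, 0.0])         # 1 N push in -x at the fingertip
tau = holding_torque(q, f_ext)
print(tau)
```

In this pose the tip sits 0.04 m above both joints, so a 1 N horizontal push creates a 0.04 N·m moment about each joint, and the actuators must supply the opposite torque to hold still.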

The learning loop is now simple. Build an MSE loss between the measured torque \tau and the estimated torque \hat{\tau}, and backpropagate it to the taxel rotation parameters.

\mathcal{L} = \|\tau - \hat{\tau}\|_2^2

flowchart TD
  A["Apply random contacts<br/>on fingertip"] --> B["Record: taxel f_i, joint torque tau, joint angle q"]
  B --> C["Rotate taxel forces by current R_i estimate"]
  C --> D["Taxel-to-CoP map -> f_cop, p_cop"]
  D --> E["Forward kinematics + position Jacobian"]
  E --> F["Predicted torque tau_hat = -J^T f_cop"]
  F --> G["Loss = ||tau - tau_hat||^2"]
  G -->|"backprop through<br/>differentiable dynamics"| C
  style G fill:#d62d2d,stroke:#7a1a1a,color:#fff

The rotation is parameterized with the R9+SVD scheme recommended for learning on SO(3). An arbitrary 3\times 3 matrix P is projected with the SVD onto the nearest valid rotation matrix.

R = \text{SVD}^+(P) = U\,\text{diag}(1,1,\det(UV^\top))\,V^\top, \qquad P = U\Sigma V^\top
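The projection is short enough to write out directly. This sketch uses NumPy's `np.linalg.svd`, which returns V^\top as its third output; the function name `svd_plus` is an illustrative choice.

```python
import numpy as np

def svd_plus(P):
    """Project an arbitrary 3x3 matrix onto SO(3)."""
    U, _, Vt = np.linalg.svd(P)             # P = U diag(s) Vt
    d = np.linalg.det(U @ Vt)               # +1 or -1
    return U @ np.diag([1.0, 1.0, d]) @ Vt  # flip the last axis if needed so det = +1

rng = np.random.default_rng(0)
P = rng.standard_normal((3, 3))             # unconstrained 9-parameter matrix
R = svd_plus(P)
print(np.round(R @ R.T, 6))                 # identity up to rounding
print(np.linalg.det(R))                     # determinant of a proper rotation
```

The optimizer is free to update all nine entries of P without constraints, and the projection guarantees that whatever comes out is a proper rotation, including when P itself is closer to a reflection.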

Why statics: solving the full dynamics requires knowing accelerations and inertias, whereas static equilibrium uses a single constraint, "forces and torques balance". If the robot is pushed gently and made to hold its pose, the holding torque is a shadow of the external force, and from that shadow the direction each taxel faces can be inferred in reverse. The robot's own torque sensing is all the equipment needed.

According to Appendix A, just two minutes of data are enough (2400 samples at 20 Hz), and the optimization runs for only 100 Adam steps. The authors also state that, in principle, any differentiable mapping function, such as a neural network, can be learned the same way. The framework is therefore not tied to CoP.

3.4 Practical details of sim-real alignment

These are the points where the paper is candid.

  • Dropping the shear component: IsaacLab's ContactSensor estimated the shear component unreliably for this fingertip geometry, so only the surface-normal component is used in both sim and real. The authors chose sim-to-real robustness at the cost of some information.
  • Actuator dynamics: Subtle actuator behavior such as non-uniform friction is hard to model analytically, so system identification based on Bayesian optimization matches the simulator's stiffness, damping, and joint friction to the real hardware (Appendix C). The response is probed with step, ramp, and chirp inputs.
  • Sensor delay: The tactile sensor has a non-negligible delay between the moment of contact and the moment the policy receives the observation. This is critical for dynamic tasks, so the delay is measured with a vision-based method and injected into the simulation during training.
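One common way to inject a fixed observation delay into a simulation loop is a small ring buffer. The sketch below is a generic illustration, not the authors' implementation: the class name `DelayedObservation` and the two-step delay are assumptions, and the delay actually measured in the paper is not reproduced here.

```python
from collections import deque

class DelayedObservation:
    """Return the observation from `delay_steps` control steps ago."""

    def __init__(self, delay_steps, initial):
        # Pre-fill so that the first reads return the initial (no-contact) value.
        self.buffer = deque([initial] * (delay_steps + 1), maxlen=delay_steps + 1)

    def step(self, new_obs):
        self.buffer.append(new_obs)   # the oldest entry is dropped automatically
        return self.buffer[0]         # value observed delay_steps steps earlier

delayed = DelayedObservation(delay_steps=2, initial=0.0)
outputs = [delayed.step(x) for x in [1.0, 2.0, 3.0, 4.0]]
print(outputs)  # [0.0, 0.0, 1.0, 2.0]
```

In a training loop, the simulated contact observation would pass through such a buffer before reaching the policy, so the policy learns under the same lag it will see on hardware.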

The biggest practical payoff of this alignment is that teacher-student distillation is not needed. In earlier tactile sim-to-real work, tactile observations did not match between sim and real, so a teacher with access to privileged information was trained in sim and then distilled into a student that sees only real observations, a two-stage process. With CoP, sim and real share the same form, so direct transfer works and the pipeline is one stage shorter.

flowchart LR
  subgraph PRIOR["Prior tactile sim-to-real"]
    direction TB
    T1["Teacher (privileged, sim)"] --> T2["Student distillation"] --> T3["Deploy"]
  end
  subgraph THIS["This work (CoP)"]
    direction TB
    C1["Train on aligned CoP (sim)"] --> C3["Deploy directly"]
  end
  style C1 fill:#2d6cdf,stroke:#1a3d7a,color:#fff


์‹คํ—˜: ๋ˆˆ ๊ฐ์€ ์†์ด ๋ชป์„ ๋ผ์šฐ๊ณ  ๊ณต์„ ๊ตด๋ฆฐ๋‹ค

ํ•˜๋“œ์›จ์–ด์™€ ๊ณผ์ œ

ํ”Œ๋žซํผ์€ 16-DOF Allegro hand์— XELA uSkin ์„ผ์„œ๋ฅผ ์†๊ฐ€๋ฝ ๋ยท์ง€๊ณจ(phalange)ยท์†๋ฐ”๋‹ฅ์— ๋ถ€์ฐฉํ•œ ๊ตฌ์„ฑ์ด๋‹ค. ์ •์ฑ…์€ blind ๋‹ค. ์‹œ๊ฐ ์—†์ด ๊ณ ์œ ์ˆ˜์šฉ์„ฑ(proprioception: ํ˜„์žฌยท๋ช…๋ น ๊ด€์ ˆ๊ฐ)๊ณผ ์ ‘์ด‰ ๊ด€์ธก๋งŒ์œผ๋กœ ๋™์ž‘ํ•œ๋‹ค.

๋‘ ๊ณผ์ œ๊ฐ€ ๊นŒ๋‹ค๋กœ์šด ์ง„์งœ ์ด์œ ๋Š” ์ด์ฐจ ์ ‘์ด‰(secondary contact) ์— ์žˆ๋‹ค. ๋ณดํ†ต์˜ in-hand ์กฐ์ž‘์€ ์†๊ณผ ๋ฌผ์ฒด ์‚ฌ์ด์˜ ์ผ์ฐจ ์ ‘์ด‰๋งŒ ๋‹ค๋ฃฌ๋‹ค. ์—ฌ๊ธฐ์„œ๋Š” ๋ฌผ์ฒด์™€ ๋˜ ๋‹ค๋ฅธ ๋ฌผ์ฒด(๊ตฌ๋ฉ, ํŒ) ์‚ฌ์ด์˜ ์ ‘์ด‰ ์ƒํƒœ๋ฅผ ์ผ์ฐจ ์ ‘์ด‰์˜ ์ด‰๊ฐ ํ”ผ๋“œ๋ฐฑ์„ ํ†ตํ•ด ๊ฐ„์ ‘์ ์œผ๋กœ ์ถ”๋ก  ํ•ด์•ผ ํ•œ๋‹ค. ์†๋์œผ๋กœ ๋ชป์„ ์ฅ” ์ฑ„, ๋ชป ๋จธ๋ฆฌ๊ฐ€ ๊ตฌ๋ฉ ๊ฐ€์žฅ์ž๋ฆฌ์— ์–ด๋–ป๊ฒŒ ๋‹ฟ๋Š”์ง€๋ฅผ ์†๋ ๊ฐ๊ฐ๋งŒ์œผ๋กœ ์ฝ์–ด๋‚ด๋Š” ์‹์ด๋‹ค.

The baselines are well chosen, which makes the ablation clean.

| Abbreviation | Representation | Character |
|---|---|---|
| base | proprioception only | no contact information |
| bin | binary contact per sensing array | coarsest representation |
| mag | CoP force magnitude (scalar) | discards direction and position |
| vec | force vector only | one half of CoP |
| pos | contact position only | the other half of CoP |
| taxel | raw taxel forces | richest, highest-dimensional |
| cop | force vector + position (proposed) | physics-grounded middle ground |
| human | human expert | upper-bound reference |
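
To make the relationship between these baselines concrete, here is a minimal sketch of how each representation could be derived from one fingertip's taxel readings. It is an illustration under stated assumptions, not the paper's mapping: the function name `taxels_to_representations`, the force-weighted centroid formula, the contact threshold, and the sample geometry are all hypothetical, and the silicone force-spreading model from Section 3.2 is not reproduced.

```python
import math


def taxels_to_representations(positions, normals, forces, contact_threshold=0.05):
    """Derive the baseline representations for one sensing array.

    positions: taxel centres (x, y, z) in the fingertip frame, metres
    normals:   unit direction of each taxel (assumed already calibrated)
    forces:    non-negative normal force magnitude per taxel, newtons
    """
    total = sum(forces)
    # Net force: every taxel pushes along its own direction.
    vec = tuple(sum(f * n[k] for f, n in zip(forces, normals)) for k in range(3))
    mag = math.sqrt(sum(c * c for c in vec))
    in_contact = total > contact_threshold
    if in_contact:
        # Assumed centre of pressure: force-weighted mean of taxel positions.
        pos = tuple(
            sum(f * p[k] for f, p in zip(forces, positions)) / total for k in range(3)
        )
    else:
        pos = (0.0, 0.0, 0.0)  # hypothetical placeholder when nothing is touching
    return {
        "bin": int(in_contact),  # binary contact flag
        "mag": mag,              # scalar force magnitude
        "vec": vec,              # 3D force vector only
        "pos": pos,              # 3D contact position only
        "cop": vec + pos,        # 6D: force vector followed by position
        "taxel": list(forces),   # raw per-taxel forces
    }


# Two taxels 1 cm apart, both facing +z, with the second pressed three times harder.
rep = taxels_to_representations(
    positions=[(0.0, 0.0, 0.0), (0.01, 0.0, 0.0)],
    normals=[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)],
    forces=[1.0, 3.0],
)
```

Note how the dimensionality of `cop` stays fixed at six per sensing array regardless of how many taxels the array has, whereas `taxel` grows with the sensor.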

Training uses IsaacLab with asymmetric actor-critic PPO. The actor sees proprioception and the contact representation; the critic additionally sees privileged information such as the object state. Instead of an MLP over a long stacked history, the policy network uses a GRU-based recurrent architecture. Providing temporal context without inflating the observation dimension gave better sample efficiency and convergence quality (Appendix D, Fig. 12). With an MLP, the gains saturate even as the history grows.

Task 1: Peg-in-Hole Insertion

The authors fabricated pegs with a cylindrical handle and six head shapes (circle, diamond, ellipse, hexagon, square, triangle), along with matching holes. The holes are enlarged by 10% in x and y to provide insertion clearance, because with zero clearance the simulator's mesh approximation and numerical error cause jamming. The peg's yaw is fully randomized at every reset.

The key numbers from the results (Table 1):

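| Representation | Overall success rate | OOD-initialization success rate | Success rate with 40% sensor masking |
|---|---|---|---|
| base | 0.43 | 0.17 | n/a |
| bin | 0.53 | 0.20 | 0.52 |
| mag | 0.55 | 0.27 | 0.48 |
| vec | 0.67 | 0.42 | 0.57 |
| pos | 0.50 | 0.28 | 0.48 |
| taxel | 0.48 | 0.27 | 0.30 |
| cop (ours) | 0.78 | 0.63 | 0.62 |
| human | 1.00 | n/a | n/a |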

Several things stand out.

  • CoP๊ฐ€ ์ „ ์˜์—ญ์—์„œ 1์œ„. ํŠนํžˆ OOD ์ดˆ๊ธฐํ™”์—์„œ ๊ฒฉ์ฐจ๊ฐ€ ๋ฒŒ์–ด์ง„๋‹ค(0.63 vs ์ฐจ์ˆœ์œ„ 0.42). ํ’๋ถ€ํ•œ ์ ‘์ด‰ ์ •๋ณด๊ฐ€ ์žˆ์–ด์•ผ ์ฒซ ์‚ฝ์ž… ์‹œ๋„๊ฐ€ ์‹คํŒจํ–ˆ์„ ๋•Œ ๋ชป์„ ๋‹ค์‹œ ์† ์•ˆ์—์„œ ์ด๋™ยท์žฌ์ •๋ ฌํ•ด ํšŒ๋ณตํ•  ์ˆ˜ ์žˆ๋‹ค. ๋‹จ์ˆœ ํ‘œํ˜„(baseยทbin)์€ ํŠน์ • yaw์—์„œ ๋น ๋ฅธ ์‚ฝ์ž…์€ ๋˜์ง€๋งŒ, ์‹คํŒจ ํ›„ ํšŒ๋ณต์ด ์•ˆ ๋œ๋‹ค.
  • vec๊ณผ pos๋Š” ์ƒ๋ณด์ . ํž˜ ๋ฒกํ„ฐ๋งŒ, ์œ„์น˜๋งŒ ์“ฐ๋ฉด CoP๋ณด๋‹ค ๋–จ์–ด์ง„๋‹ค. ๋‘˜์„ ํ•ฉ์ณ์•ผ ์ง„๊ฐ€๊ฐ€ ๋‚œ๋‹ค. ablation์ด ํ‘œํ˜„ ์„ค๊ณ„๋ฅผ ์ •๋‹นํ™”ํ•œ๋‹ค.
  • taxel์ด ๊ฑฐ์˜ ๊ผด์ฐŒ. ์›์‹œ ํƒ์…€์ด ๊ฐ€์žฅ ์ •๋ณด๊ฐ€ ๋งŽ์€๋ฐ๋„ ์„ฑ๋Šฅ์ด ๋‚˜์˜๋‹ค. ๋ถˆ์™„์ „ํ•œ ์ด‰๊ฐ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ + ๊ณ ์ฐจ์› + ์„ผ์„œ ํŠนํ™” ๋ถˆ์ผ์น˜๊ฐ€ ๊ฒน์นœ ํƒ“์ด๋‹ค. ์ด๊ฑด โ€œํ’๋ถ€ํ•จ์ด ๊ณง ์ „์ด ์„ฑ๋Šฅ์€ ์•„๋‹ˆ๋‹คโ€๋ผ๋Š” ์ด ๋…ผ๋ฌธ์˜ ์ฃผ์žฅ์„ ์‹ค์ฆํ•œ๋‹ค.
  • ์†๋„์˜ trade-off. ๊ณ fidelity ํ‘œํ˜„(vecยทcop)์€ ์„ฑ๊ณต๋ฅ ์ด ๋†’์€ ๋Œ€์‹  ์™„๋ฃŒ ์‹œ๊ฐ„์ด ๊ธธ๋‹ค. ๋” ์ ์‘์ ยท๋ˆ์งˆ๊ธฐ๊ฒŒ ์›€์ง์ด๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.
  • ์„ผ์„œ ๋งˆ์Šคํ‚น ๊ฒฌ๊ณ ์„ฑ. 40% ํƒ์…€์„ ๋งค ์Šคํ… ๋ฌด์ž‘์œ„๋กœ ๊ฐ€๋ ธ์„ ๋•Œ, ๊ณ fidelity ํ‘œํ˜„์ด ๋‹จ์ˆœ ํ‘œํ˜„๋ณด๋‹ค ๋” ํฌ๊ฒŒ ์ €ํ•˜๋œ๋‹ค(๊ฐœ๋ณ„ ํƒ์…€ ํž˜์— ๋ฏผ๊ฐํ•˜๋ฏ€๋กœ). ๊ทธ๋ž˜๋„ CoP๋Š” 0.62๋กœ ์—ฌ์ „ํžˆ ์ตœ๊ณ . ํ‘œํ˜„์˜ ์ง‘๊ณ„์ (aggregate) ์„ฑ๊ฒฉ์ด ์–ด๋А ์ •๋„ ์™„์ถฉ ์—ญํ• ์„ ํ•œ๋‹ค.
  • ์‚ฌ๋žŒ์€ ์—ฌ์ „ํžˆ ์••๋„์ . ์‚ฌ๋žŒ์€ ์ด‰๊ฐ์— ๋”ํ•ด ๊ณ ์ˆ˜์ค€ ๊ธฐํ•˜ ์ถ”๋ก ๊ณผ ํƒ์ƒ‰ ์ „๋žต์„ ์“ฐ๋Š” ๋ฐ˜๋ฉด, ๋กœ๋ด‡ ์ •์ฑ…์€ ์‹œ๋ฎฌ๋ ˆ์ด์…˜์—์„œ ํ•™์Šตํ•œ ๋ฐ˜์‘์ (reactive) ํ”ผ๋“œ๋ฐฑ์— ์˜์กดํ•œ๋‹ค๋Š” ํ•ด์„.

Task 2: Ball Balancing

The four fingertips support a light (50 g) square plate, and the task is to bring the ball on top of it to the center without dropping it. Training in simulation uses a smooth sphere; evaluation on hardware uses four balls that differ in mass, size, friction, and surface (tennis ball, baseball, moon ball, hockey ball). These lie outside the training distribution, so their rolling behavior varies widely. On top of that, non-uniform friction and gear backlash significantly hurt performance, one finger's taxel region does not touch the plate and therefore provides no contact information, and three of the four fingers operate near a singularity, which makes fast motion difficult. The metric is time-to-fall (TTF, higher is better).

| Representation | Overall TTF (s) |
|---|---|
| base | 1.38 |
| bin | 1.99 |
| mag | 2.40 |
| vec | 4.52 |
| pos | 1.55 |
| taxel | 1.49 |
| cop (ours) | 4.60 |
| human | 9.37 |

The message here is sharper: learning only works with numerical force information. base, bin, and pos, which carry no explicit force, fail to learn the task even in simulation (Fig. 13b). Only cop, vec, and taxel succeed there, and of those cop and vec end up on par (4.60 s vs. 4.52 s TTF). In this task force matters more than position, which suggests the force vector alone may be sufficient.

Fig. 5 shows two interesting emergent behaviors: an aggressive single-step accelerate-decelerate maneuver and a slower two-stage centering. The robot is weak on the smooth, fast-rolling hockey ball, whereas humans are weakest on the dented moon ball (nonlinear, unpredictable rolling). The contrast is that humans tend to extrapolate the future with a linear model and are therefore vulnerable to nonlinear behavior, while the robot policy reacts to the instantaneous state.

Does the Learned Policy "Understand" Physics?

This is the most stimulating analysis for roboticists. It looks inside the 256-dimensional latent of the policy network's recurrent layer.

Object state prediction (linear probing). The ball's xy position and velocity are predicted from the latent with a linear probe.

| | x pos | y pos | x vel | y vel |
|---|---|---|---|---|
| RMSE (m for position, m/s for velocity) | 0.013 | 0.019 | 0.059 | 0.065 |
| r^2 | 0.76 | 0.62 | 0.23 | 0.15 |

์œ„์น˜๋Š” ์ž˜ ์žก์ง€๋งŒ ์†๋„๋Š” ์•ฝํ•˜๋‹ค. ์ •์ฑ…์ด ์ ‘์ด‰ ์ •๋ณด๋กœ ์œ„์น˜๋Š” ์ถ”์ ํ•˜๋˜, ์šด๋™ ๋™์—ญํ•™์€ ์ •๋ฐ€ํ•˜๊ฒŒ ์ธ์ฝ”๋”ฉํ•˜์ง€ ๋ชปํ•œ๋‹ค๋Š” ๋œป์ด๋‹ค. ์ ‘์ด‰ ๊ธฐ๋ฐ˜ ์ƒํƒœ ์ถ”์ •์˜ ๋ณธ์งˆ์  ๋…ธ์ด์ฆˆ ํƒ“์ผ ๊ฒƒ์ด๋‹ค.

Implicit mass identification (PCA). Latents are collected from trajectories with balls of three masses (50, 150, 250 g) and projected to 2D with PCA. What Fig. 7 shows is striking: as a trajectory progresses in time, the latent embeddings reorganize themselves into distinct clusters by mass. The Silhouette Coefficient rises monotonically from -0.04 at T=0.05 s to 0.51 at T=4.55 s. Nobody supervised mass; the policy came to represent a physical property as a by-product of trying to control well.
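
The Silhouette Coefficient used here compares, for each point, the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b), and averages (b - a) / max(a, b) over all points. A minimal pure-Python sketch follows; the 2D points stand in for PCA-projected latents and are made up for illustration, with mass labels in grams.

```python
import math


def silhouette_coefficient(points, labels):
    """Mean silhouette score over all points (labels must be a list)."""
    clusters = set(labels)
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the same cluster.
        same = [
            math.dist(p, q)
            for j, (q, other) in enumerate(zip(points, labels))
            if other == lab and j != i
        ]
        if not same:
            scores.append(0.0)  # convention for a singleton cluster
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to any other cluster.
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c)
            / labels.count(c)
            for c in clusters
            if c != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Made-up 2D embeddings: the same four points with two different labelings.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
separated = silhouette_coefficient(points, [50, 50, 250, 250])
mixed = silhouette_coefficient(points, [50, 250, 50, 250])
print(f"separated={separated:.2f}  mixed={mixed:.2f}")
```

Values near 1 indicate tight, well-separated clusters, values near 0 indicate overlapping clusters, and negative values indicate points that sit closer to another cluster than to their own, which matches the reading of the -0.04 to 0.51 progression.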

Why this matters: when a physics-grounded representation (CoP) is given as input, the policy's internal state also tends to organize around physically meaningful variables such as mass. Beyond "it works," this is evidence that the physical grounding of a representation can carry over into the policy's interpretability and generalization. Plant physics in the input, and physics grows in the representation.


Critical Discussion: Strengths and Limitations

Strengths

  • ํ‘œํ˜„ ์„ค๊ณ„์˜ ์›์น™์„ฑ. CoP๋Š” ad-hoc feature๊ฐ€ ์•„๋‹ˆ๋ผ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ ์ ‘์ด‰ ๋ฌผ๋ฆฌ๋Ÿ‰๊ณผ ๋™ํ˜•(ๅŒๅฝข)์ด๋‹ค. ์ด ํ•œ ๊ฐ€์ง€ ๊ฒฐ์ •์ด distillation ์ œ๊ฑฐ, ์ง์ ‘ ์ „์ด, ๊ฐ€๋ฒผ์šด ์บ˜๋ฆฌ๋ธŒ๋ ˆ์ด์…˜์„ ์ค„์ค„์ด ๊ฐ€๋Šฅ์ผ€ ํ•œ๋‹ค. ์šฐ์•„ํ•œ ์„ค๊ณ„๋Š” ๋ถ€์ˆ˜ ํšจ๊ณผ๊ฐ€ ๋งŽ๋‹ค.
  • ์บ˜๋ฆฌ๋ธŒ๋ ˆ์ด์…˜์˜ ์‹ค์šฉ์„ฑ. ๊ณ ๊ฐ€ ํž˜ ์„ผ์„œ ์—†์ด, 2๋ถ„ ๋ฐ์ดํ„ฐ์™€ ๋กœ๋ด‡ ํ† ํฌ๋งŒ์œผ๋กœ ํƒ์…€ ๋ฐฉํ–ฅ์„ ํ•™์Šตํ•œ๋‹ค. ๋ฏธ๋ถ„๊ฐ€๋Šฅ ๋™์—ญํ•™์„ ์—ญ์ „ํŒŒํ•œ๋‹ค๋Š” ๋ฐœ์ƒ์ด ๊น”๋”ํ•˜๊ณ , ์‹ ๊ฒฝ๋ง ๋งคํ•‘์œผ๋กœ ํ™•์žฅ ๊ฐ€๋Šฅํ•˜๋‹ค๋Š” ์ผ๋ฐ˜์„ฑ๊นŒ์ง€ ๋‚จ๊ฒจ๋’€๋‹ค.
  • ์ •์งํ•œ ablation. vec/pos/mag/taxel/binary๋ฅผ ๋ชจ๋‘ ๋น„๊ตํ•ด, CoP์˜ ๋‘ ์„ฑ๋ถ„์ด ์ƒ๋ณด์ ์ด๋ฉฐ ์›์‹œ ์‹ ํ˜ธ๊ฐ€ ์˜คํžˆ๋ ค ํ•ด๋กญ๋‹ค๋Š” ๊ฑธ ๋ฐ์ดํ„ฐ๋กœ ๋ณด์ธ๋‹ค. ์ฃผ์žฅ๊ณผ ์ฆ๊ฑฐ๊ฐ€ ์ผ๋Œ€์ผ๋กœ ๋ถ™๋Š”๋‹ค.
  • ๋ฌผ๋ฆฌ ํ‘œ์ƒ์˜ emergence ๋ถ„์„. ๋‹จ์ˆœ ์„ฑ๋Šฅํ‘œ๋ฅผ ๋„˜์–ด ์ž ์žฌ ๊ณต๊ฐ„์„ ํ•ด๋ถ€ํ•ด, ๋ฌผ๋ฆฌ ๊ธฐ๋ฐ˜ ์ž…๋ ฅ์ด ๋ฌผ๋ฆฌ ๊ธฐ๋ฐ˜ ํ‘œ์ƒ์„ ๋‚ณ๋Š”๋‹ค๋Š” ํ†ต์ฐฐ์„ ์ œ์‹œํ•œ๋‹ค. ํ›„์† ์—ฐ๊ตฌ์˜ ๊ฐ€์„ค์„ ๋˜์ง€๋Š” ๊ธฐ์—ฌ๋‹ค.
  • ๊ณผ์ œ ์„ค๊ณ„. ์‹œ๊ฐ์„ ๋ฐฐ์ œํ•˜๊ณ  ์ด์ฐจ ์ ‘์ด‰ ์ถ”๋ก ์„ ์š”๊ตฌํ•˜๋Š” blind ๊ณผ์ œ๋กœ, ์ด‰๊ฐ์˜ ๊ธฐ์—ฌ๋ฅผ ๊นจ๋—์ด ๋ถ„๋ฆฌํ–ˆ๋‹ค.

Limitations (Acknowledged by the Authors + Additional Observations)

  • Fidelity vs. Transferability. CoP๋Š” ์˜๋„์ ์œผ๋กœ raw ์ •๋ณด๋ฅผ ๋ฒ„๋ฆฐ๋‹ค. ๋” ๋ณต์žกํ•œ ์กฐ์ž‘์—์„  ๊ทธ ๋ฒ„๋ฆฐ ๋””ํ…Œ์ผ์ด ํ•„์š”ํ•  ์ˆ˜ ์žˆ๋‹ค. ์ •ํ™•ํ•œ ์„ผ์„œ๋ณ„ ๋ชจ๋ธ๊ณผ ์ง์ง€์€ raw ํ‘œํ˜„์ด ๋” ๋†’์€ ์„ฑ๋Šฅ์„ ๋‚ผ ์—ฌ์ง€๋Š” ์—ฌ์ „ํžˆ ๋‚จ๋Š”๋‹ค.
  • ์ „๋‹จ๋ ฅ ํฌ๊ธฐ. ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ์˜ ์ „๋‹จ ์ถ”์ •์ด ๋ถˆ์•ˆ์ •ํ•ด ๋ฒ•์„  ์„ฑ๋ถ„๋งŒ ์ผ๋‹ค. ๋ฏธ๋„๋Ÿฌ์ง(slip) ๊ฐ์ง€์ฒ˜๋Ÿผ ์ „๋‹จ์ด ๋ณธ์งˆ์ ์ธ ๊ณผ์ œ์—๋Š” ํ˜„์žฌ ํ˜•ํƒœ๋กœ ๋ถ€์กฑํ•˜๋‹ค. ์ด๊ฑด CoP์˜ ๊ฒฐํ•จ์ด๋ผ๊ธฐ๋ณด๋‹ค ํ˜„ ์ ‘์ด‰ ์‹œ๋ฎฌ๋ ˆ์ด์…˜์˜ ํ•œ๊ณ„์ง€๋งŒ, ์–ด์จŒ๋“  ์ ์šฉ ๋ฒ”์œ„๋ฅผ ์ œ์•ฝํ•œ๋‹ค.
  • ์ ‘์ด‰ ๋ฒ”์œ„ ๋ถˆ์ผ์น˜. ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ๋Š” task-object ์ ‘์ด‰๋งŒ ๋ณด๊ณ ํ•˜๋Š”๋ฐ, ์‹ค์ œ ์„ผ์„œ๋Š” self-collisionยทํ™˜๊ฒฝ ์ ‘์ด‰๊นŒ์ง€ ๋‹ค ๋А๋‚€๋‹ค. ๋‹ค์–‘ํ•œ ํ™˜๊ฒฝ์—์„  OOD ์ด‰๊ฐ ๊ด€์ธก์„ ์œ ๋ฐœํ•  ์ˆ˜ ์žˆ๋‹ค.
  • ๊ณ ์ • ๋ฒ ์ด์Šค, ๋‹จ์ผ ์„ผ์„œ ์ข…๋ฅ˜. ํ‘œํ˜„์˜ ํšจ๊ณผ๋ฅผ ๋ถ„๋ฆฌํ•˜๋ ค fixed-base ์† + XELA uSkin์œผ๋กœ ํ•œ์ •ํ–ˆ๋‹ค. arm-hand ์‹œ์Šคํ…œ, ์ „(ๅ…จ)์† ์ด‰๊ฐ, ๋‹ค๋ฅธ ์„ผ์„œ๋กœ์˜ ํ™•์žฅ์€ ๋ฏธ๋ž˜ ๊ณผ์ œ๋‹ค.
  • ์‚ฌ๋žŒ๊ณผ์˜ ํฐ ๊ฒฉ์ฐจ. ๋‘ ๊ณผ์ œ ๋ชจ๋‘ ์‚ฌ๋žŒ์ด ์••๋„ํ•œ๋‹ค. ๋ฐ˜์‘์  ์ •์ฑ…์˜ ํ•œ๊ณ„์ด๋ฉฐ, ๊ณ ์ˆ˜์ค€ ๊ธฐํ•˜ ์ถ”๋ก ยท์˜ˆ์ธก์  ์ œ์–ด์˜ ๋ถ€์žฌ๋ฅผ ๋“œ๋Ÿฌ๋‚ธ๋‹ค.
  • ์ถ”๊ฐ€ ๊ด€์ฐฐ โ€” ๋‹จ์ผ ์ ‘์ด‰ ๊ฐ€์ •. CoP๋Š” ์†๊ฐ€๋ฝ ๋๋‹น ํ•˜๋‚˜์˜ ํ•ฉ๋ ฅยท์ค‘์‹ฌ์ ์œผ๋กœ ํ™˜์›ํ•œ๋‹ค. ํ•œ ์†๊ฐ€๋ฝ์ด ๋™์‹œ์— ์—ฌ๋Ÿฌ ๋–จ์–ด์ง„ ์ง€์ ๊ณผ ์ ‘์ด‰ํ•˜๋Š” ๋‹ค์ ‘์ด‰ ์ƒํ™ฉ์€ ๋ณธ์งˆ์ ์œผ๋กœ ํ‘œํ˜„ํ•˜์ง€ ๋ชปํ•œ๋‹ค(์ €์ž๋„ ๊ฐ์ฃผ์—์„œ ์ธ์ •). ๋ณต์žกํ•œ in-hand regrasp๋‚˜ ๋„๊ตฌ ์กฐ์ž‘์—์„œ ์ œ์•ฝ์ด ๋  ์ˆ˜ ์žˆ๋‹ค.

Comparison with Related Work

Let us place this paper on the landscape of tactile sim-to-real.

| Line of work | Representative work | Contact representation | Sim-to-real approach | Limitation |
|---|---|---|---|---|
| Implicit contact | proprioceptive control-error based | joint tracking error | direct | limited performance gain |
| Simple discrete | Rotating without seeing, Touch RL | binary / ternary | direct (stable) | loss of detail |
| Polar force + position | AnyRotate | force magnitude + polar position | learned encoder | limited to in-hand rotation, encoder-dependent |
| Raw modeling | shear/normal skin, PTLD, etc. | raw taxel / latent | teacher-student distillation | sensor-specific, hard to interpret, hard to align |
| Geometry-consistent | TacMap, HydroShear | penetration depth map / hydroelastic shear | refined sim model | complex modeling and calibration cost |
| This work | CoP | 3D force + 3D position (same form as simulator physics) | direct transfer (no distillation) | normal only, single contact, fixed base |

The positioning is as follows. AnyRotate used a similar idea, force magnitude plus polar position, but still relied on a learned tactile encoder, and its evaluation was confined to in-hand rotation. CoP solves the bidirectional mapping in closed form without an encoder, and because the representation has the same form as the simulator's physical quantities, direct transfer works. The TacMap and HydroShear line takes the head-on approach of making the simulated contact physics itself more accurate, which brings a correspondingly heavy modeling and calibration burden. CoP goes the other way: it solves the problem not by "making the simulation more accurate" but by "moving up to a level of abstraction that both sides can agree on."

The points of direct contact with your own research context (Allegro V5 + DIGIT, TACTO/TacSL, CTR vs DeXtreme) are also clear. This is a case of pushing the sim-to-real that DeXtreme and the Rubik's cube line of work solved with low-dimensional state up into the tactile dimension, while sidestepping the pitfalls of raw signals through physical abstraction. A natural follow-up question is whether extracting a CoP-like representation from a vision-based tactile sensor such as DIGIT requires redesigning the stress-distribution model on an optical basis.


Summary and Conclusion

The paper in one sentence: the bottleneck in tactile sim-to-real was not "simulating touch more accurately" but "finding a level of physical abstraction that sim and real can agree on." CoP (3D force vector + 3D contact position) is clever in that this level of abstraction has the same form as the contact quantities a rigid-body simulator already outputs. What follows from that is direct transfer without teacher-student distillation, differentiable calibration without a force sensor, performance ahead of both binary and raw taxel representations, and the emergence of physical properties such as mass in the latent space.

Takeaways for robotics practitioners:

  1. Choose the abstraction level of a representation by how well sim and real can be aligned. What determines transfer is not "the richest signal" but "the signal both sides can express in the same form." Raw taxel losing to cop is the evidence for this lesson.
  2. Physical grounding yields free by-products. When the input is a physical quantity, calibration becomes easier (it can be solved through statics), the representation becomes interpretable (mass clusters), and cues for generalization appear.
  3. Distillation was a detour around failed alignment. Align the representation well and an entire stage disappears. Simplifying the pipeline is itself a form of robustness.
  4. State the cost of simplification explicitly. The price of discarding shear, multi-contact, and environment contact is clear, and it marks the boundary of applicability.

Plenty of open questions remain. How should the stress model be rebuilt to extract CoP from optical tactile sensors (DIGIT/GelSight)? If shear could be simulated reliably, how far would performance rise? To represent multi-contact, is the answer to extend "a single CoP" into "a set of CoPs"? And, as the authors suggest, can CoP serve beyond RL as a physics-grounded tactile modality for imitation learning or sample-efficient real-world RL? This paper is less an answer than a good starting point that moves tactile representation design from an "accuracy race" to "designing for alignability."

Copyright 2026, JungYeon Lee