
📃 BODex Review

qp
mujoco
grasp
Scalable and Efficient Robotic Dexterous Grasp Synthesis Using Bilevel Optimization
Published

February 25, 2026

  • Paper Link
  • Code Link + Benchmark
  • Project Link
  1. 🚀 This work uses bilevel optimization and GPU acceleration to build a scalable, efficient system for robotic dexterous grasp synthesis.
  2. 📈 The proposed BODex system outperforms existing analytic methods in both grasp quality and synthesis speed, and achieves a success rate above 75% in MuJoCo simulation.
  3. 💡 The large-scale, high-quality dataset generated with BODex substantially improves the performance of learning models, and a real-world test with a Shadow Hand reached an 81% success rate.

🔍 Ping Review

🔍 Ping: A light tap on the surface. Get the gist in seconds.

The paper BODex: Scalable and Efficient Robotic Dexterous Grasp Synthesis Using Bilevel Optimization presents a scalable, efficient pipeline and a comprehensive benchmark for grasp synthesis with high-DoF dexterous hands. Existing data-driven approaches have been held back by inefficient synthesis, strong assumptions (e.g., equal contact forces or frictionless contact), and limited object sets. To address these problems, the work formulates grasp synthesis as a bilevel optimization problem and builds a benchmark on the MuJoCo simulator.

Core Methodology

The central idea of the work is to define grasp synthesis as a bilevel optimization problem.

1. Bilevel optimization formulation

  • Goal: find a stable grasp pose $x$ (root rotation, root translation, and joint angles) for a given object. A grasp pose is the hand configuration that keeps contact with the object and can best resist external wrenches.
  • Upper-level optimization:
    • Objective: $\min_{x,\, y_j,\, j \in \{1, \ldots, s\}} \sum_{j=1}^s Q_j(x)$
      • Here $s$ is the number of target wrench directions.
      • $Q_j(x)$ is the optimal value of the lower-level problem. It measures how well the hand, at the current pose $x$, can resist a particular target wrench $t_j$. Minimizing the sum yields a grasp pose that is stable against external forces from many directions.
    • Constraints:
      • $x_{\min} \le x \le x_{\max}$: every joint and the root pose stay within their physical limits.
      • $c_{i,w} = FK(x, c_{i,l}) \in \partial O$: the $i$-th expected contact point of the hand, $c_{i,w}$ (world frame), must lie on the object surface $\partial O$. $FK$ is the forward kinematics function.
      • No (hand-hand/hand-object) collision: the hand must not collide with itself or with the object.
  • Lower-level optimization, $Q_j(x)$:
    • Given a hand pose $x$ and a target wrench $t_j$, this is a quadratic program (QP) that finds the best contact forces the hand can apply, $y_j = [f_{j,1}, \ldots, f_{j,m}]$, where $m$ is the number of contact points.
    • $Q_j(x) \triangleq \min_{y_j} \left\| \beta t_j - \sum_{i=1}^m G_i f_{j,i} \right\|^2$
      • $\beta$: a positive hyperparameter that scales the magnitude of the target wrench.
      • $t_j$: the unit vector of the target wrench. For a force-closure grasp, for example, unit vectors along the 6 principal directions are used (e.g., $[1,0,0,0,0,0]$ and $[-1,0,0,0,0,0]$).
      • $G_i \in \mathbb{R}^{6 \times 3}$: the grasp matrix of the $i$-th contact, which maps the contact force $f_{j,i}$ to the wrench applied to the object, $w_i = G_i f_{j,i}$. It is defined as $G_i = \begin{bmatrix} n_i & d_i & e_i \\ p_i \times n_i & p_i \times d_i & p_i \times e_i \end{bmatrix}$, where $n_i$ is the contact normal, $d_i, e_i$ are the tangent vectors, and $p_i$ is the contact position.
    • Constraints:
      • $f_{j,i} \in F_i, \quad i \in \{1, \ldots, m\}$: each contact force $f_{j,i}$ must lie inside the friction cone $F_i = \left\{f_i \in \mathbb{R}^3 \mid 0 \le f_{i,1} \le 1,\; f_{i,2}^2 + f_{i,3}^2 \le \mu^2 f_{i,1}^2\right\}$. To turn the quadratic constraint into linear constraints, this quadratic friction cone is approximated by an 8-vertex pyramid.
      • $\sum_{i=1}^m f_{j,i,1} \ge \gamma$: forces the hand to apply a minimum amount of pressure on the object, which rules out trivial solutions such as $y_j = 0$. $\gamma$ is a positive hyperparameter.
    • $Q_j(x)$ indicates how well the hand can produce the target wrench (the smaller the residual, the better). The upper level keeps $Q_j(x)$ differentiable with respect to $x$ so that gradient descent can be applied.
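To make the lower-level objective concrete, the sketch below assembles $G_i$ from a contact position and contact frame and evaluates the residual $\|\beta t_j - \sum_i G_i f_{j,i}\|^2$ for hand-picked forces. It is a pure-Python illustration, not the BODex implementation: the helper names (`grasp_matrix`, `wrench_residual`, `in_friction_cone`), the two-contact geometry, and the values of $\beta$ and $\mu$ are assumptions chosen so the result can be checked by hand, and no QP is solved here.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def grasp_matrix(p, n, d, e):
    """6x3 matrix G = [[n, d, e], [p x n, p x d, p x e]], stored as 6 rows."""
    cols = [tuple(v) + cross(p, v) for v in (n, d, e)]
    return [[col[r] for col in cols] for r in range(6)]


def contact_wrench(G, f):
    """Wrench w = G f for a contact force f = (normal, tangent_d, tangent_e)."""
    return [sum(G[r][c] * f[c] for c in range(3)) for r in range(6)]


def in_friction_cone(f, mu):
    """Check 0 <= f_1 <= 1 and f_2^2 + f_3^2 <= mu^2 f_1^2."""
    return 0.0 <= f[0] <= 1.0 and f[1] ** 2 + f[2] ** 2 <= (mu * f[0]) ** 2


def wrench_residual(beta, t, Gs, fs):
    """|| beta * t - sum_i G_i f_i ||^2 for given contact forces."""
    total = [0.0] * 6
    for G, f in zip(Gs, fs):
        total = [a + b for a, b in zip(total, contact_wrench(G, f))]
    return sum((beta * tj - wj) ** 2 for tj, wj in zip(t, total))


# Hypothetical antipodal contacts on the x-axis with inward-pointing normals.
G1 = grasp_matrix(p=(1, 0, 0), n=(-1, 0, 0), d=(0, 1, 0), e=(0, 0, 1))
G2 = grasp_matrix(p=(-1, 0, 0), n=(1, 0, 0), d=(0, 1, 0), e=(0, 0, 1))
forces = [(0.5, 0.1, 0.0), (0.5, 0.1, 0.0)]  # squeeze plus a small push along +y
target = (0, 1, 0, 0, 0, 0)                  # unit wrench: pure force along +y
res = wrench_residual(beta=0.2, t=target, Gs=[G1, G2], fs=forces)
feasible = all(in_friction_cone(f, mu=0.3) for f in forces)
```

In this example the two normal forces cancel and the tangential components add up to the requested +y force, so the residual vanishes. A QP solver would search for such forces automatically, subject to the same cone constraints.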

2. Solving the bilevel optimization

  • Parallelization and acceleration: at each upper-level iteration, the forward kinematics (FK) of cuRobo [7] computes the transform $R_i, T_i$ of each finger link from the hand pose $x$, which gives the expected contact points $c_{i,w}$ in the world frame. The closest point $p_i$ on the object and its normal $n_i$ are then queried to build the grasp matrix $G_i$ and set up the lower-level QPs. These QPs are solved in parallel on the GPU with a batched version of ReLU-QP [8], a PyTorch-based GPU-accelerated ADMM solver. This greatly reduces the computational bottleneck.
  • Collision and joint constraints: upper-level constraints such as joint limits, self-penetration, and inter-penetration are handled with the corresponding energy functions in cuRobo. cuRobo was modified so that the 6-DoF state of the robot root can be included as an optimization variable.

3. Coarse-to-fine contact modeling

  • The sphere-based contact model in cuRobo is fast but not accurate enough for precise grasp synthesis. A coarse-to-fine strategy is proposed to address this.
  • Coarse Stage (300 iterations): the hand geometry is approximated with spheres. The contact point $c_{i,l}$ is set to the center of the first sphere at each fingertip, and the distance energy $E_d^c = \sum_{i=1}^m (\|c_{i,w} - p_i\| - \alpha)^2$ is minimized.
  • Fine Stage (100 iterations for the pre-grasp, 100 for the grasp): the actual collision meshes are used for precise contact modeling. The GJK algorithm [29] finds the closest points between each fingertip and the object. Because GJK is not differentiable, a new distance energy $E_d^f = \sum_{i=1}^m \|c_{i,w}^{f'} - p_i^f\|^2$ and a modified grasp energy $Q' = \sum_{i=1}^m \|c_{i,w}^{f'} - p_i^c\|^2$ are defined. Here $c_{i,w}^{f'} = R_i \,\text{Detach}(c_{i,l}^f) + T_i$, so that $c_{i,w}^{f'}$ remains differentiable with respect to $R_i$ and $T_i$.
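As a small numeric illustration of the coarse-stage distance energy $E_d^c$, the snippet below evaluates it for given world-frame contact points and their closest object points. The function name, the sample coordinates, and the reading of $\alpha$ as a target center-to-surface offset (for example, a fingertip sphere radius) are assumptions; the summary above does not define $\alpha$.

```python
import math


def coarse_distance_energy(contacts_w, closest_points, alpha):
    """E_d^c = sum_i (||c_{i,w} - p_i|| - alpha)^2."""
    return sum((math.dist(c, p) - alpha) ** 2
               for c, p in zip(contacts_w, closest_points))


# Two hypothetical fingertip sphere centers (meters) and their closest object points.
contacts_w = [(0.0, 0.0, 0.03), (0.10, 0.0, 0.02)]
closest_points = [(0.0, 0.0, 0.0), (0.10, 0.0, 0.0)]
energy = coarse_distance_energy(contacts_w, closest_points, alpha=0.02)
```

The first contact sits 1 cm farther from the surface than the target offset and contributes all of the energy; the second is exactly at the target offset and contributes nothing.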

4. Pre-grasp and collision-free hand-arm trajectory synthesis

  • To provide an initial pose that avoids collision with the object and from which force can be applied, a pre-grasp pose $x_p$ at least 1 cm away from the object is synthesized.
  • The squeeze pose $x_s$, used to actually apply force, is defined as $x_s = 2x - x_p$. This pose serves as the execution target in both simulation and the real world.
  • The full optimization has three stages: a Coarse Stage (300 iterations, collision spheres, object distance reduced by 1 cm), a Fine Stage (generates the pre-grasp $x_p$; 100 iterations, collision meshes, the $Q'$ energy, no distance reduction), and a Final Stage (generates the grasp pose $x$; 100 iterations, similar to the Fine Stage, again with no distance reduction).
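The squeeze pose is a linear extrapolation from the pre-grasp through the grasp pose. The sketch below applies $x_s = 2x - x_p$ elementwise to joint angles only. Restricting it to joints is an assumption made here, since linearly extrapolating the 9 rotation-matrix entries of the root pose would not in general give a valid rotation. The function name and sample values are illustrative.

```python
def squeeze_pose(x_grasp, x_pre):
    """Extrapolate past the grasp pose: x_s = 2 * x - x_p (elementwise)."""
    if len(x_grasp) != len(x_pre):
        raise ValueError("grasp and pre-grasp poses must have the same length")
    return [2.0 * g - p for g, p in zip(x_grasp, x_pre)]


# Hypothetical joint angles in radians for a 3-joint finger.
x_pre = [0.20, 0.50, 0.10]    # fingers slightly open, clear of the object
x_grasp = [0.30, 0.70, 0.25]  # fingers in contact
x_squeeze = squeeze_pose(x_grasp, x_pre)  # closes further by the same increment
```

The grasp pose is the midpoint between the pre-grasp and the squeeze pose, so commanding $x_s$ as a position target pushes the fingers slightly "into" the object, which is how the target can produce contact force.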

Experimental Results

  • Simulation environment and objects: the MuJoCo simulator is used with the Shadow Hand and the Allegro Hand, on the object assets of DexGraspNet [4] (2,397 objects at 4 scales each, for 9,588 objects in total).
  • Metrics: Simulation Success Rate (SSR), Speed (S), Penetration Depth (PD), Self-Penetration Depth (SPD), Contact Distance Consistency (CDC), and First Variance Ratio (FVR).
  • Benchmarking analytic synthesis: compared with existing pipelines such as DexGraspNet (DGN), SpringGrasp, and FRoGGeR, the proposed pipeline is better on almost every metric, with especially large gains in SSR and in speed (up to 50x). The QP-based energy function achieves a higher SSR than existing energies such as DFC and TDG and correlates better with simulation outcomes. The coarse-to-fine and pre-grasp strategies improve SSR, PD, and CDC at the cost of speed.
  • Benchmarking learning-based synthesis: four learning architectures were benchmarked, ISAGrasp, GraspTTA, 3D Diffusion Policy, and UniDexGrasp. Models trained on the BODex dataset consistently outperformed models trained on other datasets such as DexGraspNet (e.g., the SSR of the UDG model rose from the 40% range on the DexGraspNet dataset to the 80% range on the BODex dataset). Performance also improved as the dataset size increased.
  • Real-world experiments: a Shadow Hand mounted on a UR10e arm, with an Azure Kinect sensor, was tested on 20 objects. The learned model reached an overall success rate of 81%, demonstrating its effectiveness in the real world. Failure cases were observed on thin, flat objects, where the predicted grasp was slightly off target or too wide.

Limitations and Conclusion

  • Limitations: the pipeline relies mainly on fingertip contacts and does not use palm contact. It also does not address grasping in cluttered scenes or functional grasps (e.g., opening a door). The generated collision-free trajectories have not yet been used for closed-loop visual policy learning.
  • Conclusion: the work presents a scalable, efficient pipeline for robotic dexterous grasp synthesis, which makes it easier to build large, high-quality datasets and improves data-driven grasp synthesis. The comprehensive MuJoCo benchmark shows the advantage of the pipeline and dataset, and the real-world experiments confirm their potential.

🔔 Ring Review

🔔 Ring: An idea that echoes. Grasp the core and its value.

Introduction: "If a hand has 20 joints, why is grasping so hard?"

Ask someone who is just starting to study robotic grasping: "Is it hard to pick up a water bottle?" Of course not. But for a robot? Ironically, the more fingers a hand has, that is, the more human-like it is, the harder the problem becomes for the robot.

A parallel gripper has only 1 or 2 degrees of freedom, so sampling thousands of random poses and filtering them with a quality metric works well. For dexterous hands such as the Shadow Hand or Allegro Hand, with on the order of 20 degrees of freedom (DoF), the story is completely different. With around 20 DoF, the search space explodes exponentially. Random sampling? That is like trying to hit a target in an endless universe with a single dart.

So how did earlier work approach the problem?

Data-driven, learning-based methods achieved some success, but the large, high-quality datasets they need were badly lacking. Teleoperation, where a human directly performs the grasps (RealDex), produced only 59K grasps on 52 objects. It is too slow and hard to scale.

Gradient-based optimization took a different route. It finds a grasp pose as the minimizer of an energy function, where "energy" is a numerical measure of grasp quality. The direction looks promising, but existing methods had three serious problems.

  1. Flawed physical assumptions: assuming that forces are equal at all contact points, or that there is no friction, is a simplification far from reality
  2. Speed: existing systems were too slow to generate datasets at the scale of millions of grasps
  3. No benchmark: there was no fair common standard for comparing methods

BODex tackles all three head on. The core idea is concise: define a physically sound grasp energy through bilevel optimization, and parallelize it with a GPU-accelerated QP (Quadratic Programming) solver. The result is striking: a single RTX 3090 GPU synthesizes millions of grasps per day.


Methodology: The Beauty of Bilevel Optimization

Problem Definition: What Is Grasp Synthesis?

Let us define dexterous grasp synthesis mathematically. What we are looking for is a grasp pose $x = [r, t, q] \in \mathbb{R}^{9+3+n}$.

  • $r \in \mathbb{R}^9$: root (wrist) rotation, represented as a rotation matrix
  • $t \in \mathbb{R}^3$: root position (translation)
  • $q \in \mathbb{R}^n$: joint angles ($n$ is the DoF of the hand)

The inputs are the object mesh $\mathcal{O}$ and the expected contact points $\{c_{i,l}\}$ in each link frame.

We then want the $x^*$ that satisfies:

$$x^* = \arg\min_x \; E(x) \quad \text{s.t.} \quad \text{joint limits, contact conditions, no collision}$$

The key question is how to define $E(x)$.
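For concreteness, the pose vector can be packed and unpacked as follows. This is only a layout sketch: the function names are hypothetical, and `n_joints = 16` is an example value (the joint count of an Allegro Hand), not something fixed by the formulation.

```python
def pack_pose(rotation, translation, joints):
    """Flatten a 3x3 rotation, a 3-vector, and n joint angles into x in R^(9+3+n)."""
    r = [v for row in rotation for v in row]
    if len(r) != 9 or len(translation) != 3:
        raise ValueError("expected a 3x3 rotation and a 3-vector translation")
    return r + list(translation) + list(joints)


def unpack_pose(x):
    """Split x back into (3x3 rotation, translation, joint angles)."""
    rotation = [x[0:3], x[3:6], x[6:9]]
    return rotation, x[9:12], x[12:]


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
n_joints = 16  # example value only
x = pack_pose(identity, [0.0, 0.0, 0.2], [0.0] * n_joints)
rotation, translation, joints = unpack_pose(x)
```

Using 9 numbers for the rotation is redundant (a rotation has 3 degrees of freedom), but it keeps the representation free of the singularities that Euler angles would introduce.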

Problems with Existing Energy Functions

Let us briefly look at the energy functions used in earlier work.

DFC (Differentiable Force Closure) assumes that "the force is the same at every contact point." Why is that a problem? In reality the thumb can exert far more force than the index finger. Under the equal-force assumption, a grasp that is actually good can be rated as bad, and a bad grasp can be misjudged as good.

TDG (Task-oriented DexGrasp) ignores friction. Without friction, force can only be transmitted along the contact normal. Real grasps use normal and friction forces together, so ignoring friction filters out grasps that are in fact stable.

These assumptions simplify the energy function and make it fast to compute, but its correlation with the Simulation Success Rate (SSR) is low. In other words, many grasps with low energy still drop the object in simulation.

The Core of BODex: Bilevel Optimization

The clever idea in BODex is this: "instead of defining a grasp-quality energy directly, define it as the optimal value of a force-balance problem (a QP)."

Note: Intuition for bilevel optimization

Suppose you want to grasp a water bottle on a table. What makes a grasp good?

"The bottle does not move no matter which direction a force is applied from." This is force closure.

More concretely: if, at the given hand pose, the best combination of forces the fingers can exert is able to apply the desired force and torque to the object, then the grasp is a good one.

Expressed mathematically:

  • Upper level: which hand pose is best?
  • Lower level: at a given pose, what is the best force distribution?

Written out as equations:

$$\min_x \; E(x) = \sum_j Q_j^*(x) + E_{\text{reg}}(x)$$

Here $Q_j^*(x)$ is the optimal value of the lower-level QP for the $j$-th desired wrench (force + torque):

$$Q_j^*(x) = \min_{f_1, \ldots, f_m} \; \left\| \sum_{i=1}^m \begin{pmatrix} f_i \\ r_i \times f_i \end{pmatrix} - w_j \right\|^2$$

where:

  • $f_i \in \mathbb{R}^3$: the force vector at the $i$-th contact point
  • $r_i$: the position vector of the contact point
  • $w_j$: the $j$-th target wrench (a 6D vector of force and torque)
  • the problem includes the friction-cone constraint, the non-negative normal-force constraint, and a bound on the total force

This is the same problem as in the Ping Review, in different notation: $w_j$ plays the role of $\beta t_j$, $r_i$ of $p_i$, and with $f_i$ expressed in the world frame each summand equals $G_i f_{j,i}$.

How is the friction cone handled? Represented exactly, friction gives a cone (a quadratic constraint, i.e. a QCQP), which is hard to solve. BODex replaces it with an 8-vertex pyramid approximation, which turns it into linear constraints (an LCQP). The QP solver then runs much faster.
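The effect of the pyramid approximation can be sketched in a few lines. Below, a contact force is written as a non-negative combination of 8 edge vectors, so the cone constraint turns into the linear constraints $\lambda_k \ge 0$ and $\sum_k \lambda_k \le 1$. Placing the edges on the cone surface (an inscribed pyramid), the function names, and $\mu = 0.5$ are assumptions for illustration; the review does not say how BODex places the 8 vertices.

```python
import math


def pyramid_edges(mu, k=8):
    """Edge vectors (f_n, f_d, f_e) of a k-sided pyramid inscribed in the cone."""
    return [(1.0,
             mu * math.cos(2.0 * math.pi * j / k),
             mu * math.sin(2.0 * math.pi * j / k)) for j in range(k)]


def force_from_weights(edges, lam):
    """f = sum_k lam_k * edge_k; feasibility is linear: lam_k >= 0, sum(lam) <= 1."""
    if any(l < 0.0 for l in lam) or sum(lam) > 1.0 + 1e-12:
        raise ValueError("weights must be non-negative and sum to at most 1")
    return tuple(sum(l * e[i] for l, e in zip(lam, edges)) for i in range(3))


mu = 0.5
edges = pyramid_edges(mu)
# Halfway between two neighboring edges: the direction where the pyramid is tightest.
f = force_from_weights(edges, [0.5, 0.5, 0, 0, 0, 0, 0, 0])
tangential = math.hypot(f[1], f[2])
inside_true_cone = tangential <= mu * f[0]
coverage = tangential / (mu * f[0])  # equals cos(pi/8), about 0.92
```

With an inscribed pyramid, every representable force also satisfies the original quadratic cone, at the price of giving up at most about 8% of the available tangential force in the worst direction.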

flowchart TD
    A["Object mesh O\nInitialize contact points"] --> B

    subgraph UPPER["🔴 Upper-Level Optimization"]
        B["Hand pose x = [r, t, q]\nGradient update"]
        B --> C["Compute energy\nE(x) = ΣQj*(x) + Ereg"]
        C --> D{"Converged?"}
        D -- No --> B
    end

    subgraph LOWER["🔵 Lower-Level QP"]
        E["At the current pose x\ncompute contact positions and normals"]
        E --> F["Optimize the force vector at each contact\nsubject to friction cone + wrench constraints"]
        F --> G["Optimal force distribution f*\nReturn Qj*(x)"]
    end

    C --> E
    G --> C
    D -- Yes --> H["Grasp pose candidate"]

    H --> I["MuJoCo validation"]
    I --> J{Success?}
    J -- Yes --> K["Save to dataset"]
    J -- No --> L["Discard"]

BODex bilevel optimization pipeline

๊ทธ๋ž˜๋””์–ธํŠธ๋Š” ์–ด๋–ป๊ฒŒ ํ๋ฅด๋‚˜? (์•”๋ฌต์  ๋ฏธ๋ถ„)

์ด์ค‘ ์ตœ์ ํ™”์˜ ๊ธฐ์ˆ ์  ๋‚œ๊ด€์ด ์—ฌ๊ธฐ์„œ ๋“ฑ์žฅํ•œ๋‹ค: ์ƒ์œ„ ์ตœ์ ํ™”์˜ ๊ทธ๋ž˜๋””์–ธํŠธ๋ฅผ ๊ณ„์‚ฐํ•˜๋ ค๋ฉด \frac{\partial Q_j^*(x)}{\partial x}๋ฅผ ๊ตฌํ•ด์•ผ ํ•œ๋‹ค. ๊ทธ๋Ÿฐ๋ฐ Q_j^*๋Š” QP์˜ ์ตœ์ ๊ฐ’์ด๋ผ ์ง์ ‘ ๋ฏธ๋ถ„ํ•˜๊ธฐ ๊นŒ๋‹ค๋กญ๋‹ค.

ํ•ด๋ฒ•์€ KKT ์กฐ๊ฑด์„ ์ด์šฉํ•œ ์•”๋ฌต์  ๋ฏธ๋ถ„(Implicit Differentiation)์ด๋‹ค. QP์˜ ์ตœ์ ํ•ด f^*๋Š” KKT ์กฐ๊ฑด์„ ๋งŒ์กฑํ•˜๋Š”๋ฐ, ์ด ์กฐ๊ฑด์—์„œ x์— ๋Œ€ํ•œ ๋„ํ•จ์ˆ˜๋ฅผ ์—ญ์‚ฐํ•  ์ˆ˜ ์žˆ๋‹ค. ์ตœ๊ทผ์˜ ๋ฏธ๋ถ„ ๊ฐ€๋Šฅ ์ตœ์ ํ™” ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ(differentiable QP solvers)๋“ค์ด ์ด ๊ณผ์ •์„ ์ž๋™ํ™”ํ•ด์ค€๋‹ค.
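A small numerical check helps build intuition for why the optimal value is differentiable. The toy problem below is not the BODex lower-level QP: it is a 2-D non-negative least-squares problem $Q^*(\theta) = \min_{\lambda \ge 0} \|w - A(\theta)\lambda\|^2$ with a scalar "pose" $\theta$, it omits the $\gamma$ constraint, and it uses plain projected gradient descent rather than ReLU-QP. It also swaps in a simpler technique than full KKT-based implicit differentiation: when the feasible set does not depend on $\theta$, the envelope theorem says the gradient of the optimal value is the partial derivative of the objective with the optimal $\lambda^*$ held fixed. All names and numbers are assumptions.

```python
import math


def columns(theta):
    """Two unit generator directions of A(theta) that rotate with theta."""
    return [(math.cos(theta), math.sin(theta)),
            (math.cos(theta + 0.5), math.sin(theta + 0.5))]


def d_columns(theta):
    """Derivative of each column of A(theta) with respect to theta."""
    return [(-math.sin(theta), math.cos(theta)),
            (-math.sin(theta + 0.5), math.cos(theta + 0.5))]


def residual(cols, lam, w):
    """r = w - A lam."""
    return [w[k] - sum(l * c[k] for l, c in zip(lam, cols)) for k in range(2)]


def solve_lower(theta, w, iters=2000, step=0.25):
    """Projected gradient descent for min_{lam >= 0} ||w - A(theta) lam||^2."""
    cols = columns(theta)
    lam = [0.0, 0.0]
    for _ in range(iters):
        r = residual(cols, lam, w)
        grad = [-2.0 * (r[0] * c[0] + r[1] * c[1]) for c in cols]
        lam = [max(0.0, l - step * g) for l, g in zip(lam, grad)]
    r = residual(cols, lam, w)
    return lam, r, r[0] ** 2 + r[1] ** 2


def envelope_gradient(theta, w):
    """dQ*/dtheta = -2 r^T (dA/dtheta) lam*, with lam* held fixed."""
    lam, r, _ = solve_lower(theta, w)
    dA_lam = [sum(l * dc[k] for l, dc in zip(lam, d_columns(theta)))
              for k in range(2)]
    return -2.0 * (r[0] * dA_lam[0] + r[1] * dA_lam[1])


def finite_difference(theta, w, h=1e-5):
    """Central difference of the optimal value, re-solving the problem twice."""
    return (solve_lower(theta + h, w)[2] - solve_lower(theta - h, w)[2]) / (2.0 * h)


w = (-0.2, 1.0)   # target outside the cone spanned by the two columns
theta = 0.3
g_env = envelope_gradient(theta, w)
g_fd = finite_difference(theta, w)
```

The two gradients should agree closely even though `envelope_gradient` never differentiates through the solver iterations. Differentiable QP layers extend this idea to cases where the optimal solution itself, and not only the optimal value, has to be differentiated.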

GPU Parallelization: The Secret of Speed

There are two reasons BODex is up to 50x faster than previous methods:

  1. Batched QP solver (Batched ReLU-QP): the QPs of thousands of grasps are solved simultaneously on the GPU. Previously, CPU-based QPs were run sequentially.
  2. cuRobo-based kinematics: cuRobo, NVIDIA's CUDA-accelerated robotics library, handles forward kinematics (FK) and Jacobian computation in batches.

Tip: Speed comparison (single RTX 3090)

| Method | Grasps per second |
|---|---|
| DexGraspNet (prior) | ~1 grasp/sec (estimate) |
| BODex | 49+ grasps/sec |

Millions of grasps per day become feasible, a paradigm shift in dataset scale.
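The "millions per day" figure follows directly from the per-second numbers in the table above. A quick back-of-the-envelope check, assuming the throughput is sustained around the clock and taking the rough ~1 grasp/sec baseline estimate at face value:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def grasps_per_day(grasps_per_second):
    """Daily output if the per-second throughput is sustained for 24 hours."""
    return grasps_per_second * SECONDS_PER_DAY


bodex_per_day = grasps_per_day(49)     # 4,233,600
baseline_per_day = grasps_per_day(1)   # 86,400 (using the rough ~1/sec estimate)
speedup = bodex_per_day / baseline_per_day  # 49.0, in line with "up to 50x"
```

At 49 grasps/sec a single GPU produces about 4.2 million grasps per day, which is consistent with both the "millions per day" and the "up to 50x" statements.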

Synthesis Pipeline: Coarse-to-Fine

Beyond simply solving QPs, BODex uses a coarse-to-fine refinement strategy:

Stage 1: Coarse (sphere approximation)
The hand links are approximated with spheres so that closest points can be computed quickly. Contact and collision checks against the object are fast, and a rough grasp pose is found quickly.

Stage 2: Fine (exact meshes)
The GJK (Gilbert-Johnson-Keerthi) algorithm is used for precise collision computation on the actual collision meshes. Contact locations are computed accurately and the grasp pose is refined.

Pre-grasp pose
In addition to the final grasp pose, a pre-grasp pose that "approaches the object with the fingers opened" is generated. It is essential for producing a collision-free approach trajectory when executing on a real robot.

sequenceDiagram
    participant I as Initialization
    participant C as Coarse Stage<br/>(sphere approximation)
    participant F as Fine Stage<br/>(GJK mesh)
    participant P as Pre-grasp<br/>generation
    participant V as MuJoCo<br/>validation

    I->>C: Place random initial poses
    C->>C: Fast collision/contact computation<br/>Batched QP + gradient descent
    C->>F: Pass coarse grasp candidates
    F->>F: Refine contacts on exact meshes<br/>GJK-based collision handling
    F->>P: Pass fine grasp pose
    P->>P: Generate approach pose with fingers opened<br/>(cuRobo motion planning)
    P->>V: (pre-grasp, grasp, squeeze) triple
    V-->>V: MuJoCo physics simulation<br/>Check grasp success
    V-->>I: Save on success / discard on failure

BODex synthesis pipeline in detail

Dataset Scale

Comparing the dataset generated by BODex with earlier ones:

Table 1: Comparison of existing datasets and BODex

| Dataset | Hand | Objects | Grasps | Tabletop | Pre-grasp | Collision-free trajectories | Generation method |
|---|---|---|---|---|---|---|---|
| DDGdata | Shadow | 565 | 6.9K | ✗ | ✗ | ✗ | GraspIt! |
| DexGraspNet | Shadow | 5,355 | 1.32M | ✗ | ✗ | ✗ | Optimization |
| GenDexGrasp | Multiple | 58 | 436K | ✗ | ✗ | ✗ | Optimization |
| RealDex | Shadow | 52 | 59K | ✓ | ✗ | 2,630 | Teleoperation |
| BODex-Floating | Shadow | 2,440 | 3.62M | ✗ | ✓ | ✗ | Optimization |
| BODex-Tabletop | Shadow | 2,440 | 3.41M | ✓ | ✓ | 2.62M | Optimization |

BODex also provides separate datasets for the Allegro and LEAP Hands, and the total number of grasps validated in MuJoCo is about 4.09 million.


Experiments: What the Numbers Say

Evaluation Metrics

To address the lack of a standard benchmark in this field, BODex proposes a unified MuJoCo-based benchmark. The main metrics are:

  • SSR (Simulation Success Rate): the fraction of grasps that succeed in MuJoCo simulation; the most important metric
  • PD (Penetration Depth): hand-object penetration depth (lower is better)
  • SPD (Self-Penetration Depth): penetration depth between parts of the hand itself (lower is better)
  • CDC (Contact Distance Consistency): how consistent the distances between the hand contacts and the object surface are (lower is better)
  • FVR (First Variance Ratio): used as a measure of the diversity of the generated grasp poses

Energy Function Comparison

The comparison looks at how different energy designs affect the actual synthesis quality.

Summary of results (Shadow Hand, by SSR):

Table 2: Performance comparison by energy function

| Energy | Synthesis SSR | Correlation with evaluation |
|---|---|---|
| DFC (equal-force assumption) | Low | Low |
| TDG (ignores friction) | Low | Low |
| QP_baseline | Medium | Medium |
| BODex QP (proposed) | +10% vs QP_baseline | High |

In particular, the BODex energy correlates well with simulation success: a low energy means the grasp is in fact more likely to succeed. This property matters a great deal when the energy is used as a quality label for learning-based methods.

Performance by Object Size

There is an interesting observation: the success rate drops as the object gets larger. This also makes intuitive sense:

  • Small objects: a wrapping grasp, with the fingers "enclosing" the object, is easy to form, so force closure is easy to achieve
  • Large objects: the fingers can hardly wrap around the object, and the optimization landscape becomes flatter

Another strength of the BODex energy is that this drop is smaller than for existing methods. The same effect explains why the Allegro Hand achieves a higher success rate than the Shadow Hand: the Allegro is much larger than the Shadow, so it more often ends up grasping relatively "small" objects.

Ablation Study: Contribution of Each Component

Ablation experiments check which part matters and by how much.

Table 3: Summary of ablation results (↑ marks an improvement over the baseline)

| Configuration | SSR | PD | CDC |
|---|---|---|---|
| Base optimization only | Baseline | Baseline | Baseline |
| + Pre-grasp | ↑ | ↑ | ↑ |
| + Coarse-to-Fine | ↑↑ | ↑↑ | ↑↑ |
| + Both strategies | Best | Best | Best |

The coarse-to-fine strategy raises quality considerably but sacrifices some speed. Pre-grasp generation is essential for real-world execution.

Learned Model Quality: BODex Dataset vs DexGraspNet

The real value of a dataset lies in how well a model trained on it performs.

xychart-beta
    title "Simulation success rate of learned models (%)"
    x-axis ["Trained on DexGraspNet", "Trained on BODex"]
    y-axis "Success rate (%)" 0 --> 100
    bar [40, 80]

Success rate of learned models by training dataset

Success rate of the model trained on DexGraspNet: ~40%
Success rate of the model trained on the BODex dataset: ~80%

That is roughly a twofold improvement. In a real-world experiment with a Shadow Hand on 20 diverse objects, the model reached an 81% success rate. This shows clearly how directly data quality translates into learning results.


Comparison with Related Work

Let us place BODex in context alongside other work.

graph TD
    A["Classical optimization family"] --> B["GraspIt!\n(collision checking + sampling)"]
    A --> C["Q1 metric\nGrasp Wrench Space"]

    D["Gradient-based family"] --> E["DFC\n(equal-force assumption)"]
    D --> F["TDG\n(ignores friction)"]
    D --> G["FRoGGeR\n(Differentiable FC)"]
    D --> H["SpringGrasp\n(Compliant optimization)"]
    D --> BODex["BODex ★\n(bilevel optimization + GPU QP)"]

    I["Learning-based family"] --> J["DexDiffuser"]
    I --> K["UniDexGrasp++"]

    BODex --> L["Used in DexGraspNet 2.0"]
    BODex --> M["Used in GraspVLA"]
    BODex --> N["Baseline in Dexonomy"]

Genealogy of dexterous grasp synthesis methods

DexGraspNet (Wang et al., 2023)

The largest dataset of its time, with 1.32M grasps on 5,355 objects for the Shadow Hand. It uses DFC (the equal-force assumption) as its energy function. The BODex dataset covers fewer objects (2,440), but its grasp quality is much higher, and the success rate of models trained on it doubles from 40% to 80%.

FRoGGeR (Li et al., 2023)

Introduces a differentiable force-closure metric and constrains the coefficients of the interaction wrenches to sum to 1 in order to prevent vanishing gradients. BODex takes a different approach and constrains the sum of the normal forces to be at least a threshold $\gamma$.

SpringGrasp (Chen et al., 2024)

Compliant grasp optimization that accounts for elastic fingers. The approach is different, but it shares with BODex the goal of physically realistic grasps.

GraspQP (2025, concurrent work)

Independently proposes a QP-based differentiable energy, almost the same idea as BODex. It puts more emphasis on diversity, using a variant of MALA* (Metropolis-adjusted Langevin Algorithm). That two groups independently arrived at the same direction at about the same time is itself evidence that the approach is sound.

Dexonomy (2025)

Takes BODex as its reference point and goes further, generating diverse grasp types (power grasp, pinch, etc.) according to a grasp taxonomy. It has the edge over BODex in diversity, while confirming that BODex remains competitive under difficult conditions such as higher object mass.


Critical Review: Strengths and Limitations

Strengths

1. Physically sound energy design

A QP-based energy "without those assumptions": it neither assumes equal forces nor ignores friction. This substantially raises the correlation between energy and simulation success, which in turn improves the quality of the generated data.

2. Engineering quality

The combination of cuRobo and a batched GPU QP solver is more than an algorithmic improvement; it is a system-level contribution. Reaching 49+ grasps/sec, or millions per day, on a single 3090 GPU means this work makes data-centric dexterous hand research practical.

3. Multi-hand support

A general pipeline that supports the Shadow, Allegro, and LEAP Hands. It can be applied to a new hand by changing only the configuration file.

4. Reproducible benchmark

Provides a standardized MuJoCo-based benchmark, filling a gap the field has had for a long time.

5. Influence on follow-up work

The fact that DexGraspNet 2.0, GraspVLA, and others use the BODex pipeline, as is or with modifications, shows that this work is not just a good paper but an infrastructure-level contribution.

Limitations and Future Work

1. No palm contact

BODex currently generates only fingertip-centered grasps. In human grasping the palm plays a very important role in stability, and this is not taken into account. The lack of palm contact degrades quality especially for large or heavy objects.

2. Limited diversity of fingertip-only grasps

As the paper itself acknowledges, the generated grasps are mostly of one style, wrapping around the object in a power-grasp-like manner. Other types from the grasp taxonomy, such as the precision grasp a person uses to hold a pen or the lateral pinch used to hold a key, are not covered. This is exactly the problem that follow-up work such as Dexonomy sets out to solve.

3. Floating-hand assumption

Grasp synthesis itself assumes a "hand floating in the air." Motion planning with cuRobo is added for the tabletop scenario, but extending to more complex environments (shelves, obstacles, dual arms, etc.) requires additional work.

4. Pyramid approximation of the friction cone

Approximating the friction cone with an 8-vertex pyramid is fast but introduces a small error in physical accuracy. The paper does not analyze in detail whether this approximation has a meaningful effect in real experiments.

5. Functional grasps

Task-oriented grasps such as "pick up the bottle so that you can drink from it" are not supported. BODex produces physically stable grasps but does not consider which grasp direction is suitable for using the object.

6. No real-time grasp generation

BODex is an offline dataset-generation system. 49+ grasps/sec is impressive, but synthesizing grasps online when a robot needs to grasp a new object in real time is still difficult. That requires combining it with learning-based methods.


Notes for Allegro Hand Researchers

The BODex dataset includes data for the Allegro Hand (provided on HuggingFace as allegro.tar.gz). Allegro Hand researchers should note the following:

1. The Allegro Hand has a higher success rate than the Shadow Hand
As explained above, the Allegro is larger than the Shadow, which favors it because it is more often grasping relatively small objects. In the paper, too, the SSR of the Allegro is higher than that of the Shadow.

2. Using the dataset: DexLearn
The grasp data for Allegro, LEAP, and Shadow is designed to be used with DexLearn (the training code). The network takes a single-view point cloud as input.

3. Connecting to a VLA pipeline
A layered chain is possible: BODex grasp data → learning-based policy (DexLearn) → VLA model (e.g., GraspVLA). BODex has also been used as the base pipeline in GraspVLA.

4. Use as initialization for reinforcement learning
Grasp poses generated by BODex can also serve as the starting point (initialization) for RL-based grasping policies. Starting RL from a BODex grasp pose instead of a random initialization can make exploration considerably more efficient.


Summary and Conclusion

The contribution of BODex in one line: "the first practical system to resolve both sides of the speed-quality trade-off in dexterous grasp data generation."

BODex shows what happens when the mathematical elegance of bilevel optimization meets the engineering strength of GPU parallelization. As the physical validity of the energy design improved, data quality rose, and as data quality rose, the success rate of the learned model doubled from 40% to 80%. BODex demonstrates quantitatively the obvious point that better data gives better models.

Three reasons this paper matters:

  1. Dataset contribution: millions of grasps, validated in MuJoCo, for multiple hands; it is becoming the de facto standard dataset in the field

  2. Methodological contribution: a physically sound energy design through bilevel optimization; not just fast, but fast in the right way

  3. Infrastructure contribution: a reproducible MuJoCo benchmark; different methods can now be measured with the same ruler

The limitations are clear as well: no palm contact, limited grasp diversity, and no support for functional grasps. But these are open problems that point clearly to the next steps. Dexonomy, GraspVLA, and DexGraspNet 2.0 have already started down that path, and BODex has become the starting point for all of them.

If you work on dexterous hands, it is hard to get through 2025 without knowing BODex.


References

  • Project page: https://pku-epic.github.io/BODex
  • GitHub: https://github.com/JYChen18/BODex
  • Dataset (HuggingFace): https://huggingface.co/datasets/JiayiChenPKU/BODex
  • Related papers:
    • DexGraspNet (Wang et al., 2023): pioneer of large-scale grasp datasets
    • FRoGGeR (Li et al., 2023): differentiable force-closure approach
    • GraspQP (2025): concurrent independent proposal, based on MALA*
    • Dexonomy (2025): diversity extension based on a grasp taxonomy
    • GraspVLA (2025): VLA work built on the BODex pipeline
    • cuRobo (Sundaralingam et al., 2023): the core kinematics engine of BODex

Copyright 2026, JungYeon Lee