Curieux.JY
  • JungYeon Lee

📃 GenHand

grasp
retargeting
teleoperation
GenHand: Generalised Human Grasp Kinematic Retargeting
Published

June 2, 2026

  • Paper Link
  1. 🤖 GenHand is a new object-oriented approach to the kinematic retargeting problem in teleoperation and imitation learning, a problem that arises from the morphological differences between human hands and robot grippers.
  2. 🦾 The framework extracts contact regions from the reconstructed hand-object geometry, generates physically stable contact anchors through force closure optimization, and optimizes the gripper configuration with kinematics optimization to produce human-like grasps.
  3. 📊 In extensive evaluation across a range of grippers and objects, GenHand substantially improves net wrench residual and surface contact consistency over the existing baseline, and its simulation success rate is 39.8% higher.

🔍 Ping Review

🔍 Ping: A light tap on the surface. Get the gist in seconds.

This paper addresses kinematic retargeting, which is essential for translating human gestures into robot manipulator motions, especially in object manipulation tasks that need robust, human-like grasping. Existing methods mostly replicate the hand shape, in particular the fingertip positions, and grasp quality often degrades when they are adapted to grippers with different morphologies. To overcome this limitation, the paper proposes GenHand, a kinematic retargeting algorithm that optimizes force closure and kinematic similarity to generate human-like grasps for a variety of grippers.

Existing methods and their limitations:

Kinematic retargeting methods fall into two broad categories:

  1. Hand-oriented (key-vector based) approaches: these capture the human gesture directly and map it onto the robot. They rely mainly on kinematics optimization that minimizes the spatial distance between human and robot fingertips. They are limited by how kinematically similar the human hand and the robot gripper are, and they cannot guarantee a stable retargeting when the gripper does not offer enough contact regions. They also ignore object geometry, so they cannot guarantee contact consistency or support stability analysis.
  2. Object-oriented approaches: these model contact regions on the object surface and map the robot gripper so that it covers them. This adapts better to object geometry, but grasp quality and stability are hard to maintain when retargeting to grippers with few fingers or limited dexterity.

Core idea of GenHand:

GenHand is a new object-oriented approach that combines kinematic retargeting with grasp synthesis so that stability is enforced explicitly. It extracts candidate contact regions from the reconstructed hand-object geometry and uses unsupervised clustering to abstract the human grasp into structured sub-representations. This abstraction makes a sensible adaptation to grippers with limited dexterity possible. Next, a differentiable force-closure optimization drives the sub-representations toward equilibrium-feasible configurations under friction constraints. Finally, a kinematics optimization minimizes the distance between robot links and target contacts under collision and joint-limit constraints. This optimization contains an in-loop linear-assignment and iterative closest point (LA-ICP) step that dynamically matches correspondences between the robot and the target contacts, which yields valid, executable grasp configurations.

GenHand methodology (technical details):

  1. Hand-Object Contact Modelling:
    • Reconstructs the interaction between the object and the human hand from the input image.
    • Uses DeepSDF to represent the object and MANO to represent the human hand.
    • Follows the AlignSDF design and consists of two branches:
      • Hand-branch decoder: predicts the MANO pose parameters $\theta_h$ and shape parameters $\beta_h$, then produces the hand joint positions and surface mesh through the MANO layer.
      • Object-branch decoder: takes point samples as input and predicts their SDF values. It also predicts the object center.
    • The loss function for hand and object modelling is: $L_{HO}(\mathbf{p}, \mathbf{g}) = \lambda_{sdf_o} |\widehat{sdf_o} - sdf_o| + \lambda_t \|\widehat{t_o} - t_o\|^2 + \lambda_j \|\widehat{j_h} - j_h\|^2 + \lambda_h \|(\theta_h, \beta_h)\|^2$, where $\mathbf{p}$ denotes the predictions (hatted quantities) and $\mathbf{g}$ the ground-truth values.
    • A ResNet-18 encoder extracts multiscale visual features.
  2. Contact Anchor Generation:
    • Analyzes the human grasp pattern and restructures it into a new grasp structure that fits the robot gripper.
    • Valid human contact point extraction: fingertip contact points and their surface normals are extracted from the human hand mesh produced by MANO. SDF values are used to check proximity to the object surface. A contact point is accepted as valid when it lies within a set distance of the object surface and the hand contact normal and the object surface normal satisfy the antipodal condition.
    • Contact information analysis (HDBSCAN clustering):
      • Clustering on contact normals: identifies the dominant force directions in the grasp. Centroids are computed from the corresponding contact positions.
      • Clustering on contact positions: captures the structural layout and geometric arrangement of the human grasp.
    • Contact anchor assignment to the robot gripper: anchors are assigned with a hierarchical strategy that takes the gripper's kinematic configuration into account.
      • The most distinct force components (typically antipodal grasping points) are assigned first.
      • For dexterous grippers, contact anchors keep being assigned until every primary force component found by normal-based clustering is covered.
      • If contact capacity remains, the centroids from position-based clustering are added to refine the grasp.
    • This process yields the robot gripper's initial contact region $\mathbf{x}_h$, which is the input to the force-closure optimization in the next step. To stay close to the human grasp, an additional penalty term keeps the robot grasping position $\mathbf{x}$ in the neighbourhood of the reconstructed human contact anchor $\mathbf{x}_h$: $L_d(\mathbf{x}, \mathbf{x}_h) = \text{ReLU}(\|\mathbf{x} - \mathbf{x}_h\|^2 - \epsilon)$
  3. Differentiable Force Closure Optimization:
    • Force closure states that $n$ contact points $\mathbf{x}_n \in \mathbb{R}^3$ can balance any external wrench applied to the object.
    • Under the linearised friction pyramid constraint, force closure is formulated as the following optimization problem: $\min_{\mathbf{x},\mathbf{w}} L_{fc}(\mathbf{x}, \mathbf{w}, \mathbf{x}_h, O) = L_d(\mathbf{x}, \mathbf{x}_h) + \| G \sum_{j=1}^n w_j e_j \|^2 - \text{ReLU}(G G^T - \epsilon I_{6 \times 6}) + \text{ReLU}(-\mathbf{w}) + \| \text{SDF}(O, \mathbf{x}) - \epsilon \|$, where:
      • $G = \begin{bmatrix} I_{3 \times 3} & \cdots & I_{3 \times 3} \\ S(\mathbf{x}_0) & \cdots & S(\mathbf{x}_n) \end{bmatrix} \in \mathbb{R}^{6 \times 3(n+1)}$ is the grasp matrix.
      • $S(\mathbf{x}) = \begin{bmatrix} 0 & -x_z & x_y \\ x_z & 0 & -x_x \\ -x_y & x_x & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 3}$ is the cross-product matrix that maps contact forces to the torque part of the wrench.
      • $G G^T \succeq \epsilon I_{6 \times 6}$ ensures that the wrench space is linearly independent, i.e. that the grasp is full-rank as force closure requires.
      • $\mathbf{f} = \sum_{j=1}^n w_j e_j$ (with $e_j$ the edges of the regular n-sided polygonal approximation, $\sum w_j = 1$, $w_j > 0$) is the linearised Coulomb friction cone constraint.
      • $|\text{SDF}(O, \mathbf{x}_j)| = 0$ forces each contact point to lie on the object surface.
  4. Kinematics Optimization:
    • The robot configuration is defined by the joint values $\mathbf{q}$, the global rotation $\mathbf{R}$, and the global translation vector $\mathbf{T}$.
    • The robot contact positions $\mathbf{x}_r$ are computed as: $\begin{bmatrix} \mathbf{x}_{r1} \\ \vdots \end{bmatrix} = \mathbf{R} \begin{bmatrix} \mathbf{f}_k(\mathbf{q}_1) \\ \vdots \end{bmatrix} + \mathbf{T}$, where $\mathbf{f}_k(\cdot)$ is the forward kinematics that maps joint values $\mathbf{q}$ to contact positions in the local frame.
    • To establish the correspondence between the optimal contact anchors $\mathbf{x}^*$ and the robot contact points $\mathbf{x}_r$, an LA-ICP (Linear Assignment Iterative Closest Point) alignment is run at every optimization step.
    • For given target contact anchors $\mathbf{x}^*$, the optimization objective that estimates the robot configuration $\mathbf{q}, \mathbf{R}, \mathbf{T}$ is: $\min_{\mathbf{q},\mathbf{R},\mathbf{T}} L_k(\mathbf{q}, \mathbf{R}, \mathbf{T}, \mathbf{x}^*) = \| \mathbf{x}^* - (\mathbf{R} \cdot \mathbf{f}_k(\mathbf{q}) + \mathbf{T}) \|^2 + \text{ReLU}(\mathbf{q} - \overline{\mathbf{q}}) + \text{ReLU}(\underline{\mathbf{q}} - \mathbf{q})$, where $\overline{\mathbf{q}}$ and $\underline{\mathbf{q}}$ are the upper and lower joint limits.
    • The pointwise surface-normal-based penetration detection strategy introduced in GenDexGrasp is used to compute the signed distance between the robot contact surface and the object surface, which keeps grasps physically realistic and penalizes interpenetrating configurations.

Evaluation and results:

GenHand was evaluated on the DexYCB dataset in three stages: hand-object reconstruction, kinematic retargeting, and physics-based simulation.

  • Hand-object reconstruction: on the CDh, Errj, FSh1, FSh5, CDo, Errc, FSo5, and FSo10 metrics, accuracy is slightly below gSDF, but with 3.5 times fewer parameters and 1.9 times the speed (17.38M, 475.57ms/iter) GenHand offers a better performance-efficiency trade-off.
  • Kinematic retargeting:
    • CDc (Chamfer Distance of Contact Regions): GenHand reaches contact-region similarity comparable to the baseline (a difference of 0.1-0.2mm).
    • Computational time: the extra force-closure optimization stage makes GenHand slightly slower than the baseline (29.31s vs 24.69s for Shadow Hand). In return it adds physical plausibility reasoning, which contributes to higher grasp stability and success rates.
    • Net wrench residual: GenHand achieves consistently lower net wrench residuals across all gripper types and friction levels, indicating more stable and physically plausible grasp configurations. The residual drops from 26.77 to 0.45 for the Shadow Hand and from 4.44 to 0.12 for the Robotiq gripper.
    • SDF value residual: GenHand achieves consistently lower SDF value residuals across all gripper types and friction coefficients, indicating higher contact accuracy. It stays below 0.35cm for Shadow, Allegro, and Barrett, whereas the baseline ranges from 1.34 to 1.82cm.
    • Distance residual: the higher the gripper's DOF, the lower the distance residual, which indicates that the kinematic optimization stage preserves the intended force-stable contact arrangement. The values are 0.28cm for Shadow Hand, 0.39cm for Allegro, and 0.56cm for Barrett.
  • Simulation: grasp success rates were evaluated in PyBullet.
    • GenHand+HO (the full system) outperformed the baseline even when the baseline was given ground-truth inputs. GenHand+GT (optimization stage only) had the highest success rate.
    • The baseline performed markedly worse on the Robotiq gripper in particular.
    • Success rates were high for objects with common geometric shapes (cylindrical cans, box-like containers) and low for complex shapes (scissors) and for small or thin objects (bowls, mugs).

Conclusion:

GenHand is a new framework that retargets human hand grasps into physically plausible, human-like robot grasps for a variety of end-effector types. It models the hand-object interaction from RGB images using MANO and DeepSDF, and abstracts the human grasp intent through unsupervised, clustering-based contact analysis. The abstracted contact anchors are refined by force-closure optimization, and a final kinematic optimization stage computes a robot configuration that realizes stable contact anchors while respecting the robot's mechanical constraints. GenHand outperforms the existing kinematic retargeting baseline, and does especially well on low-DOF grippers, where key-vector based retargeting struggles.


🔔 Ring Review

🔔 Ring: An idea that echoes. Grasp the core and its value.

Introduction

The human hand is a remarkable tool. Whether we hold a cup, grip scissors, or swing a hammer, we handle objects stably with almost no conscious effort. In robotics, observing such human hand motions with a camera and "transferring" them to a robot hand (gripper) is called kinematic retargeting. It is a core preprocessing step for teleoperation and for imitation learning, which learns from human demonstrations.

The problem is that human and robot hands differ too much in morphology. A human hand has dozens of degrees of freedom across five fingers, while an industrial parallel-jaw gripper has only two fingers. What happens when a robot with a different number of fingers, a different size, and a different joint structure is forced to copy "exactly what the human did"? The hand shape may look similar, but the moment the robot tries to lift the object, it slips or drops.

Existing retargeting methods fall into two families.

  • Hand-oriented / key-vector methods: these optimize the robot so that its fingertips match the human fingertip positions as closely as possible. This is the "copy the hand shape" approach. It is intuitive and fast, but it has a fatal weakness: it never looks at the object's geometry. Because it only imitates the hand shape, contact points can end up floating off the object surface, and stability collapses on grippers with a different number of fingers.
  • Object-oriented methods: these model which regions of the object surface should be touched and make the robot cover those regions. They adapt better to surface geometry, but when the grasp is transferred to a gripper with few fingers or low dexterity, there are still too few contact degrees of freedom and quality drops.

The core insight of GenHand can be summed up in one sentence.

"Solve copying the hand shape (kinematic similarity) and grasping in a physically stable way (force closure) together, in a single differentiable optimization pipeline."

As an analogy, the existing key-vector approach is "a student who mirrors the teacher's hand movements". The hand shape comes out identical, but the student does not care whether the object falls. GenHand is "a student who first understands why the teacher grasped that way (in which direction to push, and where to press so the object does not fall), and then re-grasps in a way that fits their own hand".

GenHand makes three main contributions.

  1. A full pipeline that generates physically plausible robot grasps from a single RGB image. It applies to a range of grippers, from parallel-jaw grippers to high-DOF anthropomorphic hands.
  2. An unsupervised contact analysis algorithm. It abstracts the human grasp into structured sub-representations, "force components" and "contact components", so that grippers with limited dexterity can also be handled.
  3. A kinematics optimization with LA-ICP matching inside the loop. It refines the contact placement and the robot pose together to realize a stable grasp.

To preview the key results: in simulation, across 4 grippers and 20 objects, GenHand raised the success rate by 39.8% over the key-vector baseline. It was better on both net wrench residual (residual net force/torque) and surface contact consistency, while keeping grasp similarity at a comparable level.

Method

GenHand is a three-stage pipeline, as depicted in Figure 1. The input is a single RGB image of a human grasping an object, and the output is the robot gripper configuration $\{R, T, q\}$, where $q$ is the joint values, $R$ the global rotation, and $T$ the global translation.

The overall flow, as a diagram:

flowchart TD
    A[Input RGB image of human grasp] --> B[Stage 1: Hand-Object Modelling]
    B --> B1[ResNet-18 encoder]
    B1 --> B2[Hand branch: MANO params theta_h, beta_h -> hand mesh + joints]
    B1 --> B3[Object branch: object center + SDF values -> object mesh]
    B2 --> C[Stage 2: Contact Anchor + Force Closure]
    B3 --> C
    C --> C1[Extract fingertip contacts + normals, filter by SDF and antipodal]
    C1 --> C2[HDBSCAN clustering on normals -> force components]
    C1 --> C3[HDBSCAN clustering on positions -> contact components]
    C2 --> C4[Hierarchical anchor assignment by gripper capacity]
    C3 --> C4
    C4 --> C5[Differentiable force-closure optimisation -> optimal contacts x*]
    C5 --> D[Stage 3: Kinematics Optimisation]
    D --> D1[LA-ICP correspondence between robot links and x*]
    D1 --> D2[Optimise q, R, T under joint limits and collision]
    D2 --> E[Physically plausible robot grasp]
    E --> F[PyBullet simulation validation]

Stage 1: Hand-Object Contact Modelling

First, the 3D geometry of the hand and the object has to be recovered from the image. GenHand follows the AlignSDF design and uses a dual-branch architecture (Figure 10).

  • Encoder: a ResNet-18 extracts multi-scale feature vectors from the input image, cropped to 256×256.
  • Hand branch: an MLP regresses the MANO pose parameters $\theta_h$ and shape parameters $\beta_h$. MANO represents the human hand as a differentiable parametric model, so the hand mesh vertices $v_h$ and joint positions $j_h$ follow directly from these parameters.
  • Object branch: one part predicts the object center $t_o$ with transposed convolutions, and the other, a stack of linear layers, predicts SDF (signed distance field) values. Its input is an augmented vector that concatenates the sample coordinates, the estimated object center, and the encoder features.

Intuitively, the SDF is a function that tells you "how far an arbitrary point in space is from the object surface (negative inside the surface, positive outside, zero on it)". Because it is differentiable, the constraint "put the contact point on the surface (SDF=0)" can later be added to the optimization as a smooth term.
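To make the sign convention concrete, here is a minimal sketch that uses an analytic sphere as a stand-in for the learned object SDF. The function names `sphere_sdf` and `surface_penalty` are illustrative assumptions, not part of GenHand, where the SDF is predicted by the object branch.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius

def surface_penalty(point):
    """Soft form of the 'contact lies on the surface' constraint: |SDF(x)|."""
    return abs(sphere_sdf(point))

print(sphere_sdf((0.0, 0.0, 0.0)))       # -1.0: centre of the unit sphere (inside)
print(sphere_sdf((2.0, 0.0, 0.0)))       # 1.0: one unit outside the surface
print(surface_penalty((0.0, 1.0, 0.0)))  # 0.0: exactly on the surface
```

The penalty is zero only on the surface and grows with distance on either side, which is the property a gradient-based optimizer needs.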

The training loss is defined as follows. For predictions $\mathbf{p}=\{\widehat{sdf_o}, \widehat{t_o}, \widehat{j_h}, (\theta_h, \beta_h)\}$ and ground truth $\mathbf{g}=\{sdf_o, t_o, j_h\}$,

$$L_{HO}(\mathbf{p}, \mathbf{g}) = \lambda_{sdf_o}\,|\widehat{sdf_o} - sdf_o| + \lambda_t\,\|\widehat{t_o} - t_o\|_2 + \lambda_j\,\|\widehat{j_h} - j_h\|_2 + \lambda_h\,\|(\theta_h, \beta_h)\|_2 .$$

The last term regularizes the MANO parameters and prevents unrealistic hand deformations. Training uses PyTorch with Adam, starting from a learning rate of $1\times10^{-4}$ that is halved every 500 epochs, for 1600 epochs. It takes about 60 hours on two RTX 6000 GPUs.
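The four terms of $L_{HO}$ can be written out in a few lines. This is a sketch under stated assumptions: the dictionary keys, the default weights of 1.0, and averaging the SDF term over samples are illustrative choices, not values taken from the paper.

```python
import math

def l2(a, b=None):
    """Euclidean norm of a, or of (a - b) when b is given."""
    if b is None:
        b = [0.0] * len(a)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hand_object_loss(pred, gt, lam_sdf=1.0, lam_t=1.0, lam_j=1.0, lam_h=1.0):
    """L_HO: SDF L1 term + object-centre term + hand-joint term + MANO regulariser."""
    # Mean absolute SDF error over the sampled points (averaging is an assumption).
    sdf_term = sum(abs(p - g) for p, g in zip(pred["sdf"], gt["sdf"])) / len(gt["sdf"])
    center_term = l2(pred["center"], gt["center"])
    joint_term = l2(pred["joints"], gt["joints"])  # joints flattened into one list
    mano_reg = l2(pred["mano"])                    # norm of (theta_h, beta_h)
    return lam_sdf * sdf_term + lam_t * center_term + lam_j * joint_term + lam_h * mano_reg
```

Note that the regulariser has no ground-truth counterpart: it only pulls the MANO parameters toward zero, which is why $\mathbf{g}$ contains three entries and $\mathbf{p}$ four.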

One design choice is worth noting: earlier methods reconstruct both the hand and the object as SDFs, whereas GenHand reconstructs only the object as an SDF and predicts the hand with MANO. As a result it has 3.5 times fewer parameters and runs 1.9 times faster per iteration, a trade-off that matters for real-time teleoperation. (gSDF is slightly better on some reconstruction accuracy metrics; GenHand deliberately chooses a balance between accuracy and efficiency.)

Stage 2: Contact Anchor Construction and Differentiable Force Closure

This stage is the heart of GenHand. It rests on two assumptions. (1) The input human grasp already satisfies force closure, because it is sampled from a successful demonstration. (2) Contact happens mainly at the fingertips.

(a) Valid contact point extraction and filtering. Fingertip contact points and their surface normals are extracted from the MANO hand mesh, and the object SDF is queried at each point. Only points that pass two conditions are kept as valid contacts.
  • Proximity condition: the distance to the object surface, read from the SDF value, is within a threshold.
  • Antipodal condition: the hand contact normal and the object surface normal are aligned, i.e. the two surfaces press against each other.
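The two filters can be sketched as a single predicate. This is an illustration under assumptions: both normals are taken to point outward from their own surface (so pressing surfaces have opposite normals), and the thresholds `max_dist` and `max_angle_deg` are hypothetical values, not the ones used in the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def is_valid_contact(sdf_value, hand_normal, object_normal,
                     max_dist=0.005, max_angle_deg=30.0):
    """Keep a fingertip point only if it is near the surface and the normals oppose."""
    # Proximity: |SDF| is the distance to the object surface (metres assumed).
    close = abs(sdf_value) <= max_dist
    # Antipodal: outward normals of pressing surfaces point in opposite directions,
    # so their cosine must be close to -1.
    cos_angle = dot(normalize(hand_normal), normalize(object_normal))
    antipodal = cos_angle <= -math.cos(math.radians(max_angle_deg))
    return close and antipodal
```

A fingertip 2 mm from the surface pushing straight into it passes both tests; the same fingertip sliding tangentially along the surface fails the antipodal test even though it is close.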

(b) Dual HDBSCAN clustering. HDBSCAN, a density-based unsupervised clustering method, is applied twice.
  • Clustering on normals → finds the dominant force directions in the grasp (force components). The cluster centers, however, are computed from the corresponding contact positions, which localizes "where the main force is applied".
  • Clustering on positions → captures the spatial layout of the grasp (contact components).

The two are related hierarchically: several contact components (contact locations) belong to one force component (force direction). The example in Figure 1 of the paper decomposes into 2 force directions and 5 contact components.

As an analogy, when a person holds a mug, the big picture of "push one side with the thumb and support the opposite side with the other fingers" gives the 2 force directions, and where exactly the index, middle, and ring fingers press gives the contact components. With this abstraction, a two-finger parallel gripper is assigned only the "2 force directions", while a five-finger hand such as the Shadow Hand is assigned all the contact components as well. The mapping adapts to the gripper's degrees of freedom.

(c) Hierarchical anchor assignment. The most distinct pair of force components (usually antipodal grasp points) is chosen first. If the gripper is more dexterous, anchors are added until every main force component is covered, and if contact capacity still remains, the position-based cluster centers are filled in as well.
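The assignment order in (c) can be sketched as a simple priority fill. This assumes the clustering has already produced two lists of centroids and that the force centroids are sorted by prominence. The function name `assign_anchors` and the `capacity` argument (number of contacts the gripper can make) are illustrative, not taken from the paper.

```python
def assign_anchors(force_centroids, contact_centroids, capacity):
    """Fill the gripper's contact slots: force components first, then contact components."""
    # Step 1: take force components in order of prominence, up to the capacity.
    anchors = list(force_centroids[:capacity])
    # Step 2: spend any remaining capacity on position-based centroids.
    for centroid in contact_centroids:
        if len(anchors) >= capacity:
            break
        if centroid not in anchors:
            anchors.append(centroid)
    return anchors

force = [(0.03, 0.0, 0.0), (-0.03, 0.0, 0.0)]
contact = [(-0.03, 0.01, 0.0), (-0.03, -0.01, 0.0), (-0.03, 0.0, 0.02)]
two_finger = assign_anchors(force, contact, capacity=2)   # antipodal pair only
five_finger = assign_anchors(force, contact, capacity=5)  # pair plus three refinements
```

A two-finger gripper ends up with just the antipodal pair, while a five-finger hand also receives the finer contact components.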

(d) Differentiable force closure. Physically stable contact points $x^*$ are now searched for near the assigned anchors. Force closure is the condition that $n$ contact points $x_n \in \mathbb{R}^3$ can cancel any external force applied to the object. The wrench (force and torque) of each contact is written as $\lambda = [f_n,\; x_n \times f_n]^T$. The classical linearised friction pyramid formulation is

$$GG^T \succcurlyeq \epsilon I_{6\times 6}, \qquad Gf = 0, \qquad f = \sum_{j=1}^{n_e} w_j e_j,\quad \sum_{j=1}^{n_e} w_j = 1,\quad w_j > 0, \qquad |SDF(\mathcal{O}, x_j)| = 0,$$

where the grasp matrix $G$ and the cross-product matrix $S(\cdot)$ are

$$G = \begin{bmatrix} I_{3\times3} & \cdots & I_{3\times3} \\ S(x_0) & \cdots & S(x_n) \end{bmatrix} \in \mathbb{R}^{6\times 3(n+1)}, \qquad S(x) = \begin{bmatrix} 0 & -x_z & x_y \\ x_z & 0 & -x_x \\ -x_y & x_x & 0 \end{bmatrix}.$$

Intuitively:
  • $GG^T \succcurlyeq \epsilon I_{6\times6}$: the wrench space spans all six dimensions (3D force + 3D torque), i.e. it is full-rank, so a disturbance in any direction can be resisted. If even one direction is missing, the object escapes that way.
  • $Gf=0$: the net force and net torque of the contact forces are zero, i.e. static equilibrium.
  • $f = \sum w_j e_j$: each contact force must lie inside the Coulomb friction cone, approximated by a regular n-sided pyramid (the no-slip condition).
  • $|SDF(\mathcal{O}, x)|=0$: each contact point must lie exactly on the object surface.

To keep the grasp "human-like", a penalty keeps the robot contact point $x$ near the reconstructed human contact anchor $x_h$ (within a radius $\epsilon$).

$$L_d(x, x_h) = \mathrm{ReLU}\big(\|x - x_h\|_2 - \epsilon\big).$$

All of these constraints are merged into one differentiable objective.

$$\min_{x, w} L_{fc}(x, w, x_h, \mathcal{O}) = L_d(x, x_h) + \Big\| G\sum_{j=1}^{n_e} w_j e_j \Big\|_2 - \mathrm{ReLU}(GG^T - \epsilon I_{6\times6}) + \mathrm{ReLU}(-w) + \|SDF(\mathcal{O}, x) - \epsilon\|.$$

The key trick is that the ReLU terms turn the inequality constraints (full rank, positive weights) into soft penalties. Thanks to this, the whole problem can be solved with gradient descent (Adam).
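A small sketch of how a ReLU turns a hard constraint into a soft penalty, using the two simplest terms of the objective: the anchor-distance term $L_d$ and the weight-positivity term $\mathrm{ReLU}(-w)$. Summing the positivity penalty over the weights is an assumption made here for illustration. The source does not spell out the reduction.

```python
import math

def relu(value):
    return max(0.0, value)

def distance_penalty(x, x_h, eps):
    """L_d: zero while x stays within eps of the human anchor x_h, linear beyond it."""
    return relu(math.dist(x, x_h) - eps)

def positivity_penalty(weights):
    """ReLU(-w) summed over friction-cone weights: nonzero only for negative weights."""
    return sum(relu(-w) for w in weights)

inside = distance_penalty((0.0, 0.0, 0.0), (0.01, 0.0, 0.0), eps=0.02)  # within the radius
outside = distance_penalty((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), eps=1.0)   # 5 - 1 beyond it
```

Inside the feasible region the penalty and its gradient are both zero, so the optimizer is free to move the contact to satisfy the other terms. Outside, the gradient pushes it back.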

Stage 3: Kinematics Optimisation

Finally, the target contact points $x^*$ found by force closure are realized on the actual robot gripper. The world coordinates of the robot contact points are computed from the forward kinematics $f_k(q)$ and the global transform.

$$\begin{bmatrix} x_r \\ 1 \end{bmatrix} = \begin{bmatrix} R & T \\ 0 & 1 \end{bmatrix} \begin{bmatrix} f_k(q) \\ 1 \end{bmatrix}.$$

The clever part is the in-loop LA-ICP (Linear Assignment + Iterative Closest Point) matching. The difficulty is that it is not obvious which robot link (finger) should correspond to which target anchor. At every optimization step:
  1. The robot contact points $x_r'$ and the target anchors $x^*$ are normalized and aligned into a common frame with ICP, which removes the global pose error.
  2. A pairwise Euclidean distance matrix is built and a linear assignment problem is solved, which determines the optimal correspondence dynamically. (If the robot has more contact regions than there are anchors, a representative subset is selected first.)
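The assignment half of the step can be sketched as below. To stay dependency-free, this uses a brute-force search over permutations in place of a proper linear-assignment solver (for example the Hungarian algorithm, as provided by `scipy.optimize.linear_sum_assignment`). It returns the same optimum but is only practical for a handful of points. The ICP alignment is omitted, and the function name is an illustrative assumption.

```python
import itertools
import math

def linear_assignment(anchors, robot_points):
    """Return (cost, match) where match[i] is the robot point index paired with anchor i.

    Brute-force stand-in for a linear-assignment solver; requires
    len(robot_points) >= len(anchors).
    """
    best_cost, best_match = math.inf, None
    indices = range(len(robot_points))
    # Try every way of giving each anchor a distinct robot contact point.
    for perm in itertools.permutations(indices, len(anchors)):
        cost = sum(math.dist(anchors[i], robot_points[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_match = cost, list(perm)
    return best_cost, best_match

anchors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
robot_points = [(1.0, 0.1, 0.0), (0.0, 0.1, 0.0), (5.0, 5.0, 5.0)]
cost, match = linear_assignment(anchors, robot_points)
```

Here the first anchor is paired with the second robot point and vice versa, and the distant third point is left unused, which mirrors the case where the robot offers more contact regions than there are anchors.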

With the correspondence refreshed at every step, $q$, $R$, and $T$ are optimized with the following objective.

$$\min_{q, R, T} L_k(q, R, T, x^*) = \|x^* - (R\cdot f_k(q) + T)\|_2 + \mathrm{ReLU}(q - \bar{q}) + \mathrm{ReLU}(\underline{q} - q),$$

The two ReLU terms keep the joints inside their limits $[\underline{q}, \bar{q}]$. In addition, GenDexGrasp's pointwise surface-normal-based penetration detection is adopted to penalize unrealistic poses in which a finger passes through the object or through the hand itself. The pseudocode for this stage follows.

Input: target anchors x_star, robot model FK f_k, init R,T from MANO, init q
for step in 1..N:
    x_r = R * f_k(q) + T                   # robot contact points in world frame
    align x_r and x_star via ICP           # remove global pose mismatch (used for matching only)
    D = pairwise_distance(x_star, x_r)     # distances between aligned point sets
    match = linear_assignment(D)           # dynamic correspondence: anchor i -> robot point match[i]
    loss = || x_star - x_r[match] ||_2     # compare each anchor with its matched robot point
    loss += relu(q - q_upper) + relu(q_lower - q)
    loss += penetration_penalty(q, R, T)   # GenDexGrasp normal-based
    update q, R, T via Adam on loss
return q, R, T
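For a runnable feel of the objective $L_k$, the sketch below evaluates it for a toy planar finger. Everything specific here is an assumption for illustration: the two-link planar forward kinematics `fk_planar` stands in for the real $f_k$, the rotation is a single angle `theta`, and the penetration penalty and the Adam update are left out.

```python
import math

def fk_planar(q, links=(1.0, 1.0)):
    """Fingertip position of a toy 2-link planar finger (stand-in for f_k)."""
    a, b = q[0], q[0] + q[1]
    return (links[0] * math.cos(a) + links[1] * math.cos(b),
            links[0] * math.sin(a) + links[1] * math.sin(b))

def kinematics_loss(q, theta, T, x_star, q_lower, q_upper):
    """L_k = ||x* - (R f_k(q) + T)|| + ReLU(q - q_upper) + ReLU(q_lower - q)."""
    px, py = fk_planar(q)
    c, s = math.cos(theta), math.sin(theta)
    x_r = (c * px - s * py + T[0], s * px + c * py + T[1])  # x_r = R f_k(q) + T
    fit = math.dist(x_star, x_r)
    limits = sum(max(0.0, qi - hi) + max(0.0, lo - qi)
                 for qi, lo, hi in zip(q, q_lower, q_upper))
    return fit + limits

limits_lo, limits_hi = (-1.5, -1.5), (1.5, 1.5)
# Straight finger translated by (1, 0) reaches the target (3, 0) exactly.
reached = kinematics_loss((0.0, 0.0), 0.0, (1.0, 0.0), (3.0, 0.0), limits_lo, limits_hi)
# Bending the second joint to 2.0 rad violates its 1.5 rad limit by 0.5.
violating = kinematics_loss((0.0, 2.0), 0.0, (1.0, 0.0), (3.0, 0.0), limits_lo, limits_hi)
```

In the full method this scalar would be differentiated with respect to `q`, the rotation, and `T` and minimized with Adam, with the anchor-to-link matching refreshed before each step.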

Experiments

The evaluation covers three axes: (1) hand-object reconstruction, (2) kinematic retargeting, and (3) physics simulation. The dataset is DexYCB (about 8,000 RGB videos of 10 subjects grasping 20 objects, from 8 camera views). After excluding left-hand and non-contact frames, 29,656 training and 5,928 test samples were selected with the sampling protocol of ref. 24 (from the original 857,000 frames).

Hand-object reconstruction

The metrics are Chamfer Distance (CD, cm², lower is better), F-score (FS, at 1/5/10mm thresholds, higher is better), joint error Err_j (mm), and object center error Err_c (mm) (Table 1).
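For readers unfamiliar with these metrics, here is a minimal sketch of Chamfer Distance and F-score on small point lists. Definitions vary between papers (squared versus unsquared distances, sum versus mean of the two directions). The variant below, with squared distances and the sum of both directional means, is an assumption and may differ from the one used in the paper's tables.

```python
import math

def nearest(p, points):
    """Distance from p to its nearest neighbour in points."""
    return min(math.dist(p, q) for q in points)

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance: mean squared nearest-neighbour distance, both ways."""
    a_to_b = sum(nearest(p, b) ** 2 for p in a) / len(a)
    b_to_a = sum(nearest(q, a) ** 2 for q in b) / len(b)
    return a_to_b + b_to_a

def f_score(pred, gt, tau):
    """Harmonic mean of precision and recall at distance threshold tau."""
    precision = sum(nearest(p, gt) <= tau for p in pred) / len(pred)
    recall = sum(nearest(g, pred) <= tau for g in gt) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
cd = chamfer_distance(pred, gt)
fs = f_score(pred, gt, tau=0.5)
```

Chamfer Distance penalizes every stray point in proportion to how far off it is, while the F-score only counts whether points fall within the threshold, which is why the two are reported together.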

Key finding: gSDF is marginally ahead on some reconstruction metrics, but GenHand opts for a balance with 3.5 times fewer parameters and 1.9 times the speed (measured on a single RTX 6000 GPU). Optimizing only the object with a separate SDF is reported to give $CD_o = 0.42$, $FS_{o5} = 0.70$, $FS_{o10} = 0.88$.

Kinematic retargeting

The method was evaluated on four grippers of different complexity.

| Gripper | Fingers | Joints | Characteristics |
|---|---|---|---|
| Robotiq 2F | 2 | 6 (symmetrically coupled) | Simplest parallel-jaw gripper |
| Barrett Hand | 3 | 12 (2 fingers rotating, 1 fixed) | Intermediate dexterity |
| Allegro Hand | 4 | 16 (independently actuated) | Lightweight, high DOF |
| Shadow Hand | 5 | 24 (20 actuated + 4 coupled, underactuated) | Most dexterous anthropomorphic hand |

๋น„๊ต ๋Œ€์ƒ์€ fingertip-to-palm, fingertip-to-fingertip ํ˜•์ƒ ์ฐจ์ด + ์†๋-๋ฌผ์ฒด์ค‘์‹ฌ ๋ฒกํ„ฐ ์ •๋ ฌ + ์† ๊ธฐ์ € ํšŒ์ „ ์ฐจ์ด๋ฅผ ์ตœ์†Œํ™”ํ•˜๋Š” ์ „ํ˜•์ ์ธ key-vector hand-oriented ๋ฒ ์ด์Šค๋ผ์ธ์ž…๋‹ˆ๋‹ค.

์ ‘์ด‰ ์˜์—ญ Chamfer Distance (CD_c, mm). ๋ฒ ์ด์Šค๋ผ์ธ์€ ์• ์ดˆ์— โ€œ์† ๋ชจ์–‘ ๋ณด์กดโ€์„ ์œ„ํ•ด ์„ค๊ณ„๋๋Š”๋ฐ๋„, GenHand๋Š” ๋ชจ๋“  ๊ทธ๋ฆฌํผ์—์„œ ํ‰๊ท  0.1~0.2mm ์ฐจ์ด ์ด๋‚ด์˜ ์œ ์‚ฌ๋„๋ฅผ ๋‹ฌ์„ฑํ–ˆ์Šต๋‹ˆ๋‹ค(ํ‘œ 2). ์ฆ‰ ์œ ์‚ฌ๋„๋ฅผ ๊ฑฐ์˜ ํฌ์ƒํ•˜์ง€ ์•Š์œผ๋ฉด์„œ ์•ˆ์ •์„ฑ์„ ํฌ๊ฒŒ ์–ป์—ˆ๋‹ค๋Š” ๋œป์ž…๋‹ˆ๋‹ค.

๊ณ„์‚ฐ ์‹œ๊ฐ„(ํ‘œ 3). GenHand๊ฐ€ ๋ฒ ์ด์Šค๋ผ์ธ๋ณด๋‹ค ๋‹ค์†Œ ๋А๋ฆฝ๋‹ˆ๋‹ค. force closure ๋‹จ๊ณ„๊ฐ€ ์ถ”๊ฐ€๋๊ณ , ์žฌ์•ต์ปค๋ง์ด ์ดˆ๊ธฐ ์ž์„ธ์—์„œ ๋ฒ—์–ด๋‚˜๋ฉด์„œ ์šด๋™ํ•™ ๋‹จ๊ณ„์˜ ์ถฉ๋Œ ํ•ด๊ฒฐยท๋ฐฉํ–ฅ ์ตœ์ ํ™” ๋ถ€๋‹ด์ด ์ปค์ง€๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค. BarrettยทRobotiq๋Š” ๊ตฌ์กฐ๊ฐ€ ๋‹จ์ˆœํ•˜๊ณ  ์ž๊ธฐ ์นจํˆฌ ์œ„ํ—˜์ด ๋‚ฎ์•„ ์นจํˆฌ ๊ฒ€์‚ฌ๋ฅผ ์ƒ๋žตํ•ด ๋” ๋น ๋ฆ…๋‹ˆ๋‹ค. ์ €์ž๋“ค์€ ์ด ์ถ”๊ฐ€ ์‹œ๊ฐ„์ด โ€œ์ค‘๋ณต ๊ณ„์‚ฐ์ด ์•„๋‹ˆ๋ผ ๋ฌผ๋ฆฌ ์ถ”๋ก ์„ ๋”ํ•œ ๋Œ€๊ฐ€โ€๋ผ๊ณ  ๊ฐ•์กฐํ•ฉ๋‹ˆ๋‹ค.

Net wrench residual(๊ทธ๋ฆผ 2). ์ ‘์ด‰๋ ฅ ์ ์šฉ ํ›„ ๋ฌผ์ฒด์— ๋‚จ๋Š” ๋ถˆ๊ท ํ˜• wrench์˜ ์ •๊ทœํ™” ํฌ๊ธฐ ํ•ฉ์œผ๋กœ, ์ž‘์„์ˆ˜๋ก ์ •์  ํ‰ํ˜•์— ๊ฐ€๊น๋‹ค(= ๋” ์•ˆ์ •์ ). ๋งˆ์ฐฐ๊ณ„์ˆ˜ \mu = 0.1 \sim 0.9 ์ „ ๋ฒ”์œ„์—์„œ ์ธก์ •ํ–ˆ์Šต๋‹ˆ๋‹ค(๋ฒ ์ด์Šค๋ผ์ธ์€ ๋‹จ์œ„ ๋งˆ์ฐฐ ์ฝ˜ ์‚ฌ์šฉ). - Shadow hand: ๋ฒ ์ด์Šค๋ผ์ธ ~26.77 โ†’ GenHand 0.45 - Robotiq: ~4.44 โ†’ 0.12

์ „ ๊ทธ๋ฆฌํผยท์ „ ๋งˆ์ฐฐ ์ˆ˜์ค€์—์„œ GenHand๊ฐ€ ์ผ๊ด€๋˜๊ฒŒ ๋‚ฎ์•˜์Šต๋‹ˆ๋‹ค. ๋˜ํ•œ ๋ฒ ์ด์Šค๋ผ์ธ์€ ๋งˆ์ฐฐ๊ณ„์ˆ˜๊ฐ€ ๋ณ€ํ•ด๋„ ์ž”์ฐจ๊ฐ€ ๊ฑฐ์˜ ๋ณ€ํ•˜์ง€ ์•Š๋Š”๋ฐ(์ ‘์ด‰ ์•ˆ์ •์„ฑ ์ถ”๋ก ์ด ์—†์œผ๋ฏ€๋กœ), GenHand๋Š” force-closure ์ตœ์ ํ™” ๋•๋ถ„์— ๋งˆ์ฐฐ ๋ณ€ํ™”์—๋„ ๊ฒฌ๊ณ ํ–ˆ์Šต๋‹ˆ๋‹ค. ํฅ๋ฏธ๋กญ๊ฒŒ๋„ ์†๊ฐ€๋ฝ์ด ๋งŽ์„์ˆ˜๋ก ์ž”์ฐจ๊ฐ€ ์ปค์ง€๋Š” ๊ฒฝํ–ฅ์ด ๋ณด์ž…๋‹ˆ๋‹ค. ๊ณ ์ž์œ ๋„ ์†์€ ๊ด€์ ˆยท์ ‘์ด‰ ๊ฐ€๋Šฅ์„ฑ์ด ๋งŽ์•„ ํ•ด ๊ณต๊ฐ„์ด ์ปค์ง€๊ณ  ๋” ์–ด๋ ต๋‹ค๋Š” ์˜๋ฏธ์ž…๋‹ˆ๋‹ค. ๊ทธ๋ž˜๋„ GenHand๋Š” ์ด ์–ด๋ ค์šด ์„ค์ •์—์„œ๋„ ๋‚ฎ์€ ์ž”์ฐจ๋ฅผ ์œ ์ง€ํ–ˆ์Šต๋‹ˆ๋‹ค.

SDF value residual(๊ทธ๋ฆผ 3, 4). ๋กœ๋ด‡ ์ ‘์ด‰ ์˜์—ญ๊ณผ ๋ฌผ์ฒด ํ‘œ๋ฉด ์‚ฌ์ด์˜ ๋ถ€ํ˜ธ ๊ฑฐ๋ฆฌ๋กœ, ์‹ค์ œ๋กœ ํ‘œ๋ฉด์— ์ž˜ ๋ถ™์–ด ์žˆ๋Š”์ง€๋ฅผ ๋ด…๋‹ˆ๋‹ค. GenHand๋Š” ShadowยทAllegroยทBarrett์—์„œ 0.35cm ์ดํ•˜๋ฅผ ์œ ์ง€ํ•œ ๋ฐ˜๋ฉด, ๋ฒ ์ด์Šค๋ผ์ธ์€ 1.34~1.82cm๋กœ ํ›จ์”ฌ ๋–  ์žˆ์—ˆ์Šต๋‹ˆ๋‹ค(์† ํ‚คํฌ์ธํŠธ๋งŒ ๋งž์ถ”๊ณ  ํ‘œ๋ฉด ์ ‘์ด‰์€ ์‹ ๊ฒฝ ์•ˆ ์“ฐ๋ฏ€๋กœ). Robotiq๋Š” ๊ธธ๊ณ  ํ‰ํ‰ํ•˜๋ฉฐ ๋ปฃ๋ปฃํ•œ ์†๋ ๋•Œ๋ฌธ์— GenHand์—์„œ๋„ ์ž”์ฐจ๊ฐ€ ๊ฐ€์žฅ ์ปธ์Šต๋‹ˆ๋‹ค.

Distance residual (Figure 5). How close the final robot contacts are to the force-closure anchors. The higher the DoF, the smaller the residual: Shadow <0.28 cm, Allegro ~0.39 cm, Barrett ~0.56 cm. Robotiq, by contrast, was large, reaching up to ~5 cm at low friction and ~3 cm at high friction. The parallel gripper's limited degrees of freedom and maximum grasp width made large reorientations and compromises on contact points unavoidable.

Simulation

Validation was done in PyBullet. Because a zero net wrench is enforced, gravity is treated as an active disturbance. The gripper approaches from a pre-grasp pose, grasps, and lifts vertically; a trial counts as a success if the object is held stably for 2 seconds. Three configurations were compared (Table 4).

  • GenHand + HO: the full pipeline (including hand-object modelling from the image; the simulation uses the GT object model).
  • GenHand + GT: uses GT hand and object meshes (isolates the optimisation stages → upper-bound performance).
  • Baseline + GT: the baseline supplied with GT meshes.
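The lift-and-hold success criterion in the protocol can be sketched as a check over a logged object-height trajectory. Everything here is a hypothetical stand-in: the function `lift_succeeded`, the `min_height` threshold, and the definition of "stable" as "never drops below the threshold" are assumptions, not the authors' evaluation code. The step of 1/240 s matches PyBullet's default simulation timestep.

```python
def lift_succeeded(object_heights, dt, min_height, hold_seconds=2.0):
    """Return True if the object stays above `min_height` for the final
    `hold_seconds` of the logged trajectory.

    object_heights: object z positions, one per simulation step (metres).
    dt: simulation timestep in seconds.
    """
    hold_steps = int(round(hold_seconds / dt))
    if hold_steps == 0 or len(object_heights) < hold_steps:
        return False  # trajectory too short to cover the hold window
    window = object_heights[-hold_steps:]
    return all(h >= min_height for h in window)

# Hypothetical logs sampled at 240 Hz.
dt = 1.0 / 240.0
held = [0.30] * 600                     # object stays lifted
dropped = [0.30] * 300 + [0.02] * 300   # object slips back to the table
held_ok = lift_succeeded(held, dt, min_height=0.25)
dropped_ok = lift_succeeded(dropped, dt, min_height=0.25)
```

In a real run the heights would come from the simulator's object pose at each step rather than from hand-written lists.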

Key result: GenHand + HO outperformed even the baseline supplied with GT. The advantage of the pipeline itself outweighed the difference in input quality. GenHand + GT showed the highest success rate (the upper bound) for most grippers and friction levels, and the success rate rose with the friction coefficient. Overall, GenHand recorded a 39.8% improvement over the baseline. The baseline did especially poorly on Robotiq, because relying on keypoint geometry does not generalise well to hands with a different number of fingers and joints.

Per-object success rate (Figure 8, $\mu = 0.9$). Success was high on regular geometry with well-separated surface normals, such as cylindrical cans and box-shaped containers (normal-based clustering works well there). It was low on complex objects whose normals are scattered irregularly, such as scissors, and on small or shell-like objects (these need a high-resolution SDF and precise collision handling).

Critical Discussion

Strengths

  • The conceptual unification is clean. "Hand-shape similarity" and "physical stability" used to belong to two separate camps; GenHand ties them together in a single differentiable optimisation. Relaxing force closure into ReLU penalties so that it fits a gradient-based formulation is particularly elegant.
  • It is especially strong on low-DoF grippers. Parallel grippers such as Robotiq are where existing key-vector methods are weakest, and the hierarchical anchor assignment ("abstract the force directions first, then fill in the contact components") targets exactly that weakness. Net wrench residual figures such as ~4.44 → 0.12 on Robotiq (and 26.77 → 0.45 on the Shadow hand) are impressive.
  • A practical efficiency trade-off. Using MANO for the hand and an SDF only for the object, which buys a 3.5× saving in parameters and a 1.9× speed-up, is a sensible choice with real-time teleoperation in mind.
  • Friction robustness. Staying stable over the whole range $\mu = 0.1 \sim 0.9$ shows that the force-closure reasoning actually works.
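The ReLU relaxation mentioned in the first strength can be illustrated with a generic Coulomb friction-cone penalty: the penalty is zero while a contact force lies inside the cone and grows linearly once it leaves. This is a textbook-style sketch, not the paper's loss; `friction_cone_penalty` and the example vectors are hypothetical.

```python
import math

def relu(x):
    return max(0.0, x)

def friction_cone_penalty(force, normal, mu):
    """Soft Coulomb-cone constraint: relu(|f_t| - mu * f_n).

    force: contact force applied to the object.
    normal: unit normal pointing into the object at the contact.
    mu: friction coefficient.
    """
    f_n = sum(f * n for f, n in zip(force, normal))            # normal component
    tangential = [f - f_n * n for f, n in zip(force, normal)]  # remove normal part
    f_t = math.hypot(*tangential)
    return relu(f_t - mu * f_n)

# Hypothetical contact: unit normal force with a 0.2 tangential component.
force = (0.0, 0.2, 1.0)
inward_normal = (0.0, 0.0, 1.0)
low_friction = friction_cone_penalty(force, inward_normal, 0.1)
high_friction = friction_cone_penalty(force, inward_normal, 0.9)
```

The same force violates the cone at $\mu = 0.1$ and satisfies it at $\mu = 0.9$, which mirrors why success rates in the simulation rise with friction. Because the penalty is piecewise linear, it has a usable gradient wherever the constraint is violated.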

Weaknesses and Limitations

  • Functional intent is not preserved. As the authors themselves state, GenHand guarantees a grasp that physically does not drop the object, but not functional precision such as gripping a tool exactly by its handle. A hammer held stably by its head is mechanically stable, yet the task fails. This can be fatal for tool-use and precision-manipulation applications.
  • The perception stage is the bottleneck. As Table 4 shows, reconstruction accuracy translates directly into grasp success. Incomplete reconstruction of thin shell objects, over-smoothed meshes (the scissor holes disappear), and wrong MANO pose predictions all lead to downstream failures.
  • Fragility in the optimisation stages. (1) If the optimal anchors $x^*$ cluster too closely, the linear assignment becomes ambiguous and contacts are misaligned. (2) The trade-off between penetration-check resolution and efficiency causes self-penetration and object penetration on thin objects. (3) For small objects the SDF changes sharply with tiny position changes, so with a large step size the optimisation overshoots the stable solution.
  • Validation stops at simulation. All results are based on PyBullet simulation, and no sim-to-real validation on real robot hardware is reported. How the reality gap in contact models and friction estimates will play out is unknown. (Speculation) The sim-to-real gap may be especially large for low-DoF grippers.
  • Speed. Adding force closure makes it slower than the baseline. The method is pitched at real-time teleoperation, but exact millisecond-level latency and real-time loop integration results are not emphasised in the paper.
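The assignment ambiguity in item (1) of the optimisation weaknesses can be seen in a toy example: when anchors bunch together, the best and second-best fingertip-to-anchor matchings cost almost the same, so small errors can flip the match. The sketch below brute-forces all matchings for three contacts; the coordinates and the helper names are hypothetical, and a real pipeline would use a Hungarian-style solver such as `scipy.optimize.linear_sum_assignment` instead.

```python
import itertools
import math

def ranked_assignments(fingertips, anchors):
    """All one-to-one fingertip -> anchor matchings, sorted by total distance."""
    ranked = []
    for perm in itertools.permutations(range(len(anchors))):
        total = sum(math.dist(fingertips[i], anchors[j])
                    for i, j in enumerate(perm))
        ranked.append((total, perm))
    return sorted(ranked)

def ambiguity_gap(fingertips, anchors):
    """Cost difference between the best and second-best matching."""
    ranked = ranked_assignments(fingertips, anchors)
    return ranked[1][0] - ranked[0][0]

# Hypothetical positions in centimetres.
tips = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
spread_anchors = [(0.0, 1.0, 0.0), (4.0, 1.0, 0.0), (8.0, 1.0, 0.0)]
clustered_anchors = [(3.9, 1.0, 0.0), (4.0, 1.0, 0.0), (4.1, 1.0, 0.0)]

gap_spread = ambiguity_gap(tips, spread_anchors)
gap_clustered = ambiguity_gap(tips, clustered_anchors)
```

With well-separated anchors the best matching wins by a wide margin; with clustered anchors the margin shrinks to roughly a millimetre, which is the regime where misaligned contacts become likely.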

Comparison with Related Work

  • Key-vector, hand-oriented methods (DexPilot family, refs. 7-13): focus on fingertip alignment and ignore object geometry → the baseline that GenHand directly improves on.
  • Object-oriented / contact-based methods (Contact2Grasp, ContactOpt family, refs. 17-20): model contact regions but adapt poorly to low-DoF grippers. GenHand extends them with hierarchical abstraction and force closure.
  • Differentiable force closure: borrows the differentiable force closure estimator of Liu et al. (ref. 32) and the sequential SDP of Dai et al. (ref. 33) as its grasp-synthesis foundation.
  • Grasp synthesis: belongs to the line of generalisable grasp synthesis that includes DexNet (antipodal sampling) and GenDexGrasp (ref. 35, whose penetration-detection strategy is adopted directly). What sets GenHand apart is that it integrates force closure from a retargeting perspective: "human demonstration image → diverse grippers."
  • Reconstruction backbone: follows the lineage of DeepSDF, AlignSDF, gSDF, MANO, and Obman/AtlasNet.

Summary and Conclusion

GenHand is a kinematic retargeting framework that combines two goals previously handled separately, copying the human hand shape and achieving a physically stable grasp, in a single differentiable three-stage pipeline.

  1. It efficiently reconstructs hand-object geometry from an RGB image with MANO + DeepSDF,
  2. abstracts the human grasp into force components and contact components with two-level HDBSCAN clustering, assigns anchors hierarchically according to the gripper's degrees of freedom, and finds stable contact points $x^*$ with differentiable force closure, and
  3. realises the robot pose $q, R, T$ through in-loop LA-ICP kinematics optimisation while respecting joint limits and collisions.

The headline results, over 4 grippers, 20 objects, and a range of friction levels, are a 39.8% improvement in simulation success rate over the key-vector baseline and a large reduction in net wrench residual (for example, Shadow 26.77 → 0.45) and in SDF contact residual, all while keeping contact-region similarity within 0.1 to 0.2 mm. The advantage is most pronounced on low-DoF parallel grippers, where existing methods were weakest.

For robotics practitioners, GenHand's message is clear: "Don't copy the hand motion as-is; understand why it grasped that way (the structure of the forces), then grasp again in a way that suits your own hand." That said, the lack of functional-intent preservation, the perception bottleneck, and simulation-only validation are clear limitations, and the authors themselves list improved perception quality and the integration of task-specific constraints (especially for precision grasps such as tool use) as future work. It is a practical and theoretically well-organised contribution that can be dropped straight into the data-preprocessing stage of teleoperation and imitation learning.

Copyright 2026, JungYeon Lee