Curieux.JY
  • JungYeon Lee

📃 PCHands Review

pca
cvae
rl
PCA-based Hand Pose Synergy Representation on Manipulators with N-DoF
Published

December 14, 2025

🔍 Ping. 🔔 Ring. ⛏️ Dig. A tiered review series: quick look, key ideas, deep dive.

  • Paper Link
  • Code
  • Project
  1. 🦾 PCHands proposes a new framework that combines the Anchor Description Format (ADF) with a CVAE and PCA to learn hand postural synergies shared across diverse manipulators.
  2. 🔄 The method learns a variable-length latent representation from manipulator anchor positions and aligns end-effector frames with ICP, extracting consistent synergy axes across manipulators with different DoF.
  3. ✨ In experiments, PCHands encodes the observation and action spaces for Reinforcement Learning compactly, improving learning efficiency and consistency, and enables robust transfer learning from demonstrations collected on other manipulators.

🔍 Ping Review

🔍 Ping: A light tap on the surface. Get the gist in seconds.

This paper addresses the problem of learning a common representation for dexterous manipulation across manipulators of different morphologies. To that end, the authors propose PCHands, a new PCA-based approach that extracts hand postural synergies from a wide range of manipulators.

Core Methodology

PCHands learns a variable-length hand postural synergy representation using three main components: the Anchor Description Format (ADF), a Conditional Variational Auto-Encoder (CVAE), and Principal Component Analysis (PCA).

  1. Anchor Description Format (ADF): ADF is designed to describe the configuration of diverse manipulators, from 2-finger grippers to 5-finger anthropomorphic hands, in a unified way.
    • Anchor Placement: A predefined set of 22 3D points \alpha = \{x_i | x_i \in \mathbb{R}^3\}_{i=1}^{22} is placed manually on the functional parts of each manipulator. For a 5-finger anthropomorphic hand, each finger gets 4 anchors (proximal, intermediate, distal, and tip phalanges) and the palm gets 2. For a 2-finger gripper, the 4 thumb anchors are assigned to the left jaw and the remaining 16 finger anchors are merged onto the right jaw (anchor merging). The same merging scheme generalizes to other manipulators with fewer fingers. Each anchor consistently carries the symbolic meaning of the region it represents.
    • Preliminary End-effector Frame Placement: An initial end-effector frame is defined at the center of the manipulator's palm or gripper base. The x-axis points outward from the palm, and the y-axis points toward the wrist (hands) or the thumb jaw (grippers). All anchor positions are expressed relative to this frame. Because of morphological differences, this initial frame is inconsistent across manipulators, so it goes through the iterative refinement described below.
  2. Postural Synergy Model: PCHands chains a CVAE with linear PCA, which together encode and decode between anchor positions and principal component coefficients, to extract a low-dimensional representation of manipulator posture.
    • CVAE (Conditional Variational Auto-Encoder):
      • It reduces the dimension of the anchor positions \alpha by encoding them into a low-dimensional latent variable z (\text{dim}(z) \ll 22 \times 3).
      • Both the encoder and the decoder are conditioned on a one-hot vector that identifies the manipulator.
      • The model minimizes a weighted L_1 loss between the input anchors x_i and the reconstructed anchors \hat{x}_i: \min_{\phi, \theta} \sum_{i=1}^{22} \left| w_i (x_i - \hat{x}_i) \right|, where w_i is set empirically according to how often anchor merging is used in the training dataset. For example, anchors that are rarely merged, such as the thumb anchors, receive higher weights so that the reconstruction stays balanced.
      • Training Dataset: The CVAE is trained on anchor data collected from m manipulators, with n=10000 configuration samples per manipulator. Each configuration is generated by sampling the manipulator's joint positions uniformly within its kinematic constraints. The corresponding anchor positions are computed by forward kinematics, expressed in the updated end-effector frame, and normalized to a unit Gaussian in 3D Cartesian space.
    • PCA Reduction:
      • Linear PCA is applied to the CVAE latent variable z, further reducing it to principal component coefficients z'. Keeping only the leading components is what gives the authors a variable-length latent representation.
      • PCA is applied after the CVAE so that the CVAE models the morphological (inter-manipulator) variation and PCA can focus on the pose variation shared by all manipulators. This avoids the problem seen when PCA is applied directly in anchor space, where morphological differences dominate the first principal component and degrade its ability to represent synergies.
    • Encode Pass: Converts joint values j into the compact principal component representation z': E: j \to \alpha \to z \to z'. Forward kinematics maps j to \alpha, the CVAE encoder maps \alpha to z, and PCA maps z to z'.
    • Decode Pass: Converts the compact principal component representation z' back into joint values j: D: z' \to z \to \alpha \to j. Inverse PCA maps z' to z, the CVAE decoder reconstructs \alpha from z, and multi-objective inverse kinematics maps \alpha to j.
    • Separation of Synergies and Hardware: PCHands separates the synergy model from the hardware layer that handles forward and inverse kinematics, which makes the model hardware-agnostic. To retarget a posture j_\gamma of manipulator \gamma to another manipulator \nu, the two share the common latent representation and each applies its own hardware layer during encoding and decoding: j_\nu = D(\nu, E(\gamma, j_\gamma)).
  3. Refinement of End-effector Frame: To resolve the morphological inconsistency of the initial end-effector frames, PCHands runs an iterative procedure that refines the frames and retrains the synergy model (Algorithm 1).
    • Iterative Learning Procedure: The procedure alternates between training the synergy model and refining the end-effector frames, so the synergy model is always trained on a dataset that refers to the most recently adjusted frames.
    • Anchors Alignment (Algorithm 2):
      • The refinement aligns the anchors of a target manipulator to the anchors of a set of reference manipulators: Robotiq-2f85, Google-gripper, Kinova-3f, and Armar-hand.
      • k evenly spaced points are sampled along the first principal component (1^{st} PC), for example z'_i = [i, 0, \dots, 0].
      • At each point, the decode pass maps the sampled principal component coefficients to anchors of the target manipulator and of the reference manipulators.
      • The mean anchor positions over the reference manipulators are computed.
      • Using the direct correspondence between target anchors and reference anchors over these k configurations, an adjustment to the target manipulator's end-effector frame is computed.
      • The optimal rigid transformation \delta = \{R, t\} \in SE(3) is computed with a single step of ICP (Iterative Closest Point), which minimizes \min_{R, t} \sum_{i=1}^k \left\| \alpha^{\text{ref}}_i - R\alpha^{\text{tgt}}_i - t \right\|^2, with higher weights given to fingertip and thumb anchors for a more balanced alignment.
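The training-dataset step described in the list above (uniform joint sampling, forward kinematics, normalization) can be sketched as follows. This is a minimal illustration, not the authors' code: `toy_forward_kinematics` is a hypothetical planar 2-link finger with two anchors, and the per-axis standardization is one assumed reading of "normalized to a unit Gaussian in 3D Cartesian space".

```python
import numpy as np

def toy_forward_kinematics(joints, link_lengths=(0.05, 0.03)):
    """Hypothetical planar 2-link finger: returns 3D positions of two anchors."""
    q1, q2 = joints
    l1, l2 = link_lengths
    p1 = np.array([l1 * np.cos(q1), l1 * np.sin(q1), 0.0])
    p2 = p1 + np.array([l2 * np.cos(q1 + q2), l2 * np.sin(q1 + q2), 0.0])
    return np.stack([p1, p2])  # shape (2, 3)

def sample_anchor_dataset(n_samples, joint_limits, rng):
    """Sample joints uniformly within their limits and map them to anchors."""
    low, high = joint_limits[:, 0], joint_limits[:, 1]
    joints = rng.uniform(low, high, size=(n_samples, len(low)))
    anchors = np.stack([toy_forward_kinematics(q) for q in joints])
    return joints, anchors  # anchors: (n_samples, n_anchors, 3)

def normalize_anchors(anchors):
    """Standardize to zero mean and unit variance per Cartesian axis (assumed)."""
    flat = anchors.reshape(-1, 3)
    mean, std = flat.mean(axis=0), flat.std(axis=0)
    std = np.where(std < 1e-12, 1.0, std)  # the planar toy finger has constant z
    return (anchors - mean) / std, mean, std

rng = np.random.default_rng(0)
limits = np.array([[0.0, np.pi / 2], [0.0, np.pi / 2]])  # hypothetical joint limits
joints, anchors = sample_anchor_dataset(1000, limits, rng)
normalized, mean, std = normalize_anchors(anchors)
```

In the actual method each manipulator would contribute 22 anchors and its own kinematic model; only the sampling and normalization pattern is meant to carry over.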

Experiments and Results

  • Qualitative analysis: PCHands yields consistent hand postural synergies and end-effector frames across manipulators. The first principal component consistently corresponds to a universal opening-closing motion on every manipulator.
  • Use in reinforcement learning (RL): PCHands was used to solve dexterous manipulation tasks in an RL setting. Using its N-pc (N principal components) latent representation for the observation and action spaces gave faster and more consistent learning than the baseline that learns in joint space. With human demonstrations in DAPG (Demo Augmented Policy Gradient), the results indicate that useful information is preserved when demonstration data is converted into the latent space.
  • Ablation on the source of task demonstrations: PCHands performs strongly even when the demonstrations were collected with a different manipulator, which shows that demonstration data can be reused efficiently.
  • Transfer to the real world: Policies trained in simulation with PCHands performed reasonably well on real robots without any sim-to-real adaptation. The 4-finger manipulator showed degraded performance because object occlusions made the vision-based object pose tracker inaccurate, while the 2-finger manipulator performed close to its simulation results.

Conclusion

PCHands provides a framework for extracting a unified synergy representation across human hands and robotic manipulators. Using ADF, CVAE, PCA, and ICP, it extracts a latent manipulator representation, reduces task and joint dimensionality, and aligns end-effector frames. This improves efficiency in RL-based manipulation and supports robust learning from demonstrations that come from different sources. PCHands also shows that policies trained in simulation can transfer directly to real manipulators, pointing toward scalable training of robot models through effective data and policy transfer.

🔔 Ring Review

🔔 Ring: An idea that echoes. Grasp the core and its value.

1. Introduction: Why Does This Research Matter?

In robotics, data can be called the new oil. In natural language processing (NLP) and computer vision (CV), large models trained on enormous numbers of data points have delivered remarkable results. The figures cited here are on the order of 1.5 to 4.5B training tasks for language models such as GPT-4 and 5 to 18M image-text pairs for vision models such as CLIP.

What about robotics? Even a large-scale robot dataset such as Open X-Embodiment contains only about 0.16M tasks, orders of magnitude fewer than the NLP and CV figures above. A more serious problem is that most of this data is limited to tasks performed with simple two-finger grippers.

This raises the central question: can we learn a unified representation for manipulators with different morphologies and degrees of freedom (DoF)? If so, human hand data, 5-finger robot hand data, and even simple gripper data could all be used together to learn general-purpose manipulation policies.

PCHands, proposed by a research team at IIT (Italian Institute of Technology), is aimed at exactly this problem.


2. The Core Idea of PCHands

PCHands applies the neuroscientific concept of postural synergies to robotics. The human hand has more than 20 degrees of freedom, yet most everyday grasping motions are known to be well described by combinations of a small number of "synergy" patterns.

PCHands extends this idea and learns a unified postural synergy representation for 17 different manipulators, ranging from the human hand to 5-finger anthropomorphic robot hands, 3-finger grippers, and even 2-finger parallel grippers.

Three Key Contributions

  1. Unified variable-length representation learning: combining a CVAE with PCA to learn a latent representation that is shared across manipulators yet flexible in dimension

  2. Anchor Description Format (ADF): describing manipulators of different morphologies in a unified format using 22 anchor points

  3. Demonstrated benefit in RL-based manipulation: faster convergence and higher consistency than existing methods that learn in joint space


3. Technical Methodology in Depth

3.1 Anchor Description Format (ADF): A Unified Language for Manipulators

The first core component of PCHands is ADF. Comparing manipulators with different structures requires a common "language". ADF provides one by defining 22 3D anchor points \alpha = \{x_i | x_i \in \mathbb{R}^3\}_{i=1}^{22}.

For a 5-finger anthropomorphic hand:

  • 4 anchors per finger (proximal, intermediate, distal, and tip phalanges)
  • 2 anchors on the palm
  • 22 anchors in total

For a 2-finger gripper:

  • The 4 thumb anchors are assigned to the left jaw
  • The remaining 16 finger anchors are merged onto the right jaw
  • The palm anchors are placed at the center of the gripper base

This anchor-merging approach generalizes to any manipulator with fewer than 5 fingers. Each color-coded anchor carries the same functional meaning across manipulators.
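To make the anchor bookkeeping concrete, here is a small sketch of the 22-anchor layout and the merging rule for a 2-finger gripper. The anchor names and link names (`left_jaw`, `right_jaw`, `base`) are hypothetical labels chosen for illustration; the source only specifies the counts and which jaw each group is assigned to.

```python
FINGERS = ["thumb", "index", "middle", "ring", "little"]
SEGMENTS = ["proximal", "intermediate", "distal", "tip"]

def anchor_names():
    """Return the 22 anchor labels: 5 fingers x 4 segments + 2 palm anchors."""
    names = [f"{finger}_{segment}" for finger in FINGERS for segment in SEGMENTS]
    return names + ["palm_0", "palm_1"]

def gripper_anchor_assignment():
    """Map every anchor to a link of a two-finger gripper (anchor merging)."""
    assignment = {}
    for name in anchor_names():
        if name.startswith("palm"):
            assignment[name] = "base"       # palm anchors sit on the gripper base
        elif name.startswith("thumb"):
            assignment[name] = "left_jaw"   # thumb anchors go to the left jaw
        else:
            assignment[name] = "right_jaw"  # all other fingers merge onto the right jaw
    return assignment

assignment = gripper_anchor_assignment()
counts = {link: list(assignment.values()).count(link)
          for link in ("left_jaw", "right_jaw", "base")}
```

Because every manipulator ends up with the same 22 labeled anchors, downstream models can treat the anchor vector as a fixed-size input regardless of finger count.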

3.2 Two-Stage Dimensionality Reduction: CVAE + PCA

The core architecture of PCHands is a CVAE (Conditional Variational Auto-Encoder) followed in series by PCA (Principal Component Analysis).

CVAE Stage

The CVAE encodes the anchor positions \alpha into a latent variable z, where \text{dim}(z) \ll 22 \times 3. The objective is:

\mathcal{L}_{\theta,\phi}(x,c) = \mathbb{E}_{z \sim q_\phi(z|x,c)}[\log p_\theta(x|z,c)] - \lambda D_{KL}(q_\phi(z|x,c) \| p_\theta(z))

Here \lambda weights the KL term against the reconstruction term. The key point is that the condition variable c is a one-hot vector identifying the manipulator. This lets the CVAE model the differences between manipulators while the latent space z concentrates on pose variation.

The reconstruction loss is a weighted L1 loss:

\min_{\phi,\theta} \sum_{i=1}^{22} |w_i(x_i - \hat{x}_i)|

The weights w_i are set heuristically according to how often each anchor is involved in anchor merging. For example, anchors that are rarely merged, such as the thumb anchors, receive higher weights so that the reconstruction stays balanced.
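A direct transcription of the weighted L1 reconstruction loss above, for a single sample, might look like the sketch below. The weight values (2.0 for the four thumb anchors, 1.0 elsewhere) are made-up numbers for illustration; the source only says the weights are chosen heuristically.

```python
def weighted_l1_loss(anchors, reconstructed, weights):
    """Sum over anchors of |w_i * (x_i - x_hat_i)|, accumulated over x, y, z."""
    total = 0.0
    for x, x_hat, w in zip(anchors, reconstructed, weights):
        total += sum(abs(w * (a - b)) for a, b in zip(x, x_hat))
    return total

# Hypothetical example: 22 anchors at the origin, each reconstructed 0.1 off per axis.
anchors = [(0.0, 0.0, 0.0)] * 22
reconstructed = [(0.1, 0.1, 0.1)] * 22
weights = [2.0] * 4 + [1.0] * 18  # assumed: thumb anchors first, weighted higher

loss = weighted_l1_loss(anchors, reconstructed, weights)
# thumb: 4 anchors * 3 axes * 0.2 = 2.4; others: 18 * 3 * 0.1 = 5.4
```

With these numbers the thumb contributes 2.4 of the total 7.8 even though it accounts for only 4 of the 22 anchors, which is the balancing effect the weights are meant to provide.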

Why the PCA Stage Is Needed

A natural question arises here: "Why is the CVAE alone not enough?"

The authors first show the limits of the opposite shortcut, applying vanilla PCA directly in anchor space. As Figure 3 shows, with direct PCA the first principal component ends up representing mostly the morphological differences between manipulators. In other words, depending on the composition of the dataset, the components lose their ability to express pose information.

When the CVAE models the inter-manipulator variation instead, PCA can focus on the pose variation. This is the key insight behind the two-stage approach.

Converting the latent variable z into principal component coefficients z' with PCA then brings two benefits:

  • It avoids redundancy for low-DoF manipulators (most 2-finger grippers, for example, have only 1 DoF)
  • It allows the dimension to be chosen flexibly according to the complexity of the downstream task
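The variable-length property comes from truncating the PCA basis. The sketch below fits PCA on synthetic 10-dimensional vectors that stand in for CVAE latents (\text{dim}(z) = 10 is taken from the experimental setup; the data itself is random and purely illustrative) and reconstructs them from 1, 2, 4, and 10 components.

```python
import numpy as np

def fit_pca(latents):
    """Return the data mean and principal axes (rows, sorted by variance)."""
    mean = latents.mean(axis=0)
    _, _, components = np.linalg.svd(latents - mean, full_matrices=False)
    return mean, components

def to_pc(z, mean, components, n_pc):
    """z -> z': keep only the first n_pc principal component coefficients."""
    return (z - mean) @ components[:n_pc].T

def from_pc(z_prime, mean, components):
    """z' -> z: inverse PCA using as many axes as z' has coefficients."""
    n_pc = z_prime.shape[-1]
    return z_prime @ components[:n_pc] + mean

rng = np.random.default_rng(0)
scales = np.array([5.0, 2.0, 1.0, 0.5, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
latents = rng.normal(size=(2000, 10)) * scales  # synthetic stand-in for CVAE latents

mean, components = fit_pca(latents)
errors = {}
for n_pc in (1, 2, 4, 10):
    z_prime = to_pc(latents, mean, components, n_pc)
    reconstruction = from_pc(z_prime, mean, components)
    errors[n_pc] = float(np.mean((reconstruction - latents) ** 2))
```

The same fitted basis serves every length: a 1-DoF gripper can be driven with one coefficient, while a harder task on a dexterous hand can use more, without retraining anything.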

3.3 End-Effector Frame Refinement: Iterative Alignment with ICP

Because of morphological differences, the initial end-effector frame placement is not consistent across manipulators. PCHands addresses this with an iterative learning procedure built on the Iterative Closest Point (ICP) algorithm.

Algorithm Overview

Input: M manipulators in ADF format
Output: synergy model ψ, frame alignment δ

1. δ_0 ← identity; i ← 0  // the initial alignment is the identity
2. while i ≤ budget do
3.     A_i ← create_dataset(δ_i)  // build the dataset with the current alignment
4.     ψ_i ← train_model(A_i)     // train CVAE + PCA
5.     foreach manipulator m in M do
6.         δ_{i+1}[m] ← refine_frame(ψ_i, m)  // refine the frame with ICP
7.     i ← i + 1
8. return ψ, δ

Anchor Alignment Details

Four reference manipulators are chosen for frame refinement (Robotiq-2f85, Google-gripper, Kinova-3f, Armar-hand). The selection criteria are simplicity and morphological diversity.

The refinement proceeds as follows:

  1. Sample k equally spaced points along the first principal component
  2. At each point, use the decode pass to compute the anchors of the target and reference manipulators
  3. Compute the mean anchor positions over the reference manipulators
  4. Compute the optimal rigid transformation \delta = \{R, t\} \in SE(3) with ICP using a weighted SVD

\min_{R, t} \sum_{i=1}^{k} \|\alpha_i^{\text{ref}} - R\alpha_i^{\text{tgt}} - t\|^2

Higher weights are assigned to the fingertip and thumb anchors to achieve a balanced alignment.
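Since target and reference anchors are already in one-to-one correspondence, a single ICP step reduces to a weighted least-squares rigid fit, which has a closed-form SVD solution (the Kabsch algorithm). The sketch below is a generic implementation of that standard solution, not the authors' code. The point set, the weights, and the ground-truth transform are synthetic, and in the actual procedure the points would be the anchors stacked over the k sampled configurations.

```python
import numpy as np

def weighted_rigid_align(target, reference, weights):
    """Find R, t minimizing sum_i w_i * ||reference_i - R @ target_i - t||^2."""
    w = weights / weights.sum()
    mu_t = w @ target                      # weighted centroid of the target points
    mu_r = w @ reference                   # weighted centroid of the reference points
    H = (target - mu_t).T @ (w[:, None] * (reference - mu_r))  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_r - R @ mu_t
    return R, t

# Synthetic check: rotate and shift 22 random "anchors", then recover the transform.
rng = np.random.default_rng(1)
target = rng.normal(size=(22, 3))
angle = 0.7
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.1, -0.2, 0.3])
reference = target @ R_true.T + t_true

weights = np.ones(22)
weights[:4] = 3.0  # assumed: the first four anchors play the role of thumb anchors
R, t = weighted_rigid_align(target, reference, weights)
```

With noise-free correspondences the weights do not change the answer; they matter when the target and reference anchors cannot be matched exactly, in which case heavily weighted anchors are fitted more tightly.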

3.4 Encode and Decode Passes: Practical Usage

Encode pass \mathcal{E}: j \mapsto \alpha \mapsto z \mapsto z'

  1. Forward kinematics converts the joint values j into anchor positions \alpha
  2. The CVAE encoder encodes \alpha into the latent representation z
  3. PCA converts z into principal component coefficients z'

Decode pass \mathcal{D}: z' \mapsto z \mapsto \alpha \mapsto j

  1. Inverse PCA restores z from z'
  2. The CVAE decoder reconstructs the anchor positions \alpha from z
  3. Multi-objective inverse kinematics converts \alpha into joint values j

Posture retargeting: transferring a posture from manipulator \gamma to manipulator \nu is then simply:

j_\nu = \mathcal{D}(\nu, \mathcal{E}(\gamma, j_\gamma))

The elegance of this approach is that the synergy model is decoupled from the hardware layer. Only the hardware layer used for encoding and the one used for decoding need to be swapped.
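The separation between the shared synergy model and the per-manipulator hardware layers can be sketched as plain function composition. Everything below is a toy stand-in under stated assumptions: the `HardwareLayer` and `SynergyModel` interfaces are hypothetical, each manipulator has a single scalar "anchor" (its aperture in meters), and the synergy model is a simple normalization rather than a CVAE with PCA.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class HardwareLayer:
    """Per-manipulator kinematics (hypothetical interface)."""
    forward_kinematics: Callable[[List[float]], List[float]]  # joints -> anchors
    inverse_kinematics: Callable[[List[float]], List[float]]  # anchors -> joints

@dataclass
class SynergyModel:
    """Shared, hardware-agnostic model (hypothetical interface)."""
    anchors_to_pc: Callable[[str, List[float]], List[float]]  # stands in for CVAE encoder + PCA
    pc_to_anchors: Callable[[str, List[float]], List[float]]  # stands in for inverse PCA + CVAE decoder

def encode(model, hardware, name, joints):
    """E: j -> alpha -> z' for the manipulator called `name`."""
    return model.anchors_to_pc(name, hardware[name].forward_kinematics(joints))

def decode(model, hardware, name, z_prime):
    """D: z' -> alpha -> j for the manipulator called `name`."""
    return hardware[name].inverse_kinematics(model.pc_to_anchors(name, z_prime))

def retarget(model, hardware, source, target, joints):
    """j_target = D(target, E(source, j_source))."""
    return decode(model, hardware, target, encode(model, hardware, source, joints))

# Toy manipulators: a 1-DoF gripper (joint = opening width) and a 2-DoF hand
# (joints = two flexion angles in [0, 1.5] rad that close the hand).
MAX_APERTURE = {"gripper": 0.085, "hand": 0.10}
hardware: Dict[str, HardwareLayer] = {
    "gripper": HardwareLayer(
        forward_kinematics=lambda j: [j[0]],
        inverse_kinematics=lambda a: [a[0]],
    ),
    "hand": HardwareLayer(
        forward_kinematics=lambda j: [0.10 * (1.0 - (j[0] + j[1]) / 3.0)],
        inverse_kinematics=lambda a: [1.5 * (1.0 - a[0] / 0.10)] * 2,
    ),
}
model = SynergyModel(
    anchors_to_pc=lambda name, a: [a[0] / MAX_APERTURE[name]],
    pc_to_anchors=lambda name, z: [z[0] * MAX_APERTURE[name]],
)

# A half-open gripper maps to a half-closed hand.
hand_joints = retarget(model, hardware, "gripper", "hand", [0.0425])
```

Adding a third manipulator in this sketch only requires a new `HardwareLayer` entry (and, in the real method, including it when the synergy model is trained); `encode`, `decode`, and `retarget` stay unchanged.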


4. Analysis of Experimental Results

4.1 Experimental Setup

  • Manipulators: 17 types (Robotiq, WidowX, Fetch, xArm, WSG50, Rethink, Kinova2F, GoogleBot, Kinova3F, Franka, Armar, ergoCub, Schunk, Allegro, Shadow, LEAP, MANO)
  • Dataset: 10,000 configuration samples per manipulator (170,000 in total)
  • CVAE architecture: 4-layer MLP + Layer Normalization, \text{dim}(z) = 10

4.2 Qualitative Analysis: What the First Principal Component Means

A striking finding is that the first principal component consistently represents the "hand opening/closing" motion on all 17 manipulators.

As Figure 4 shows:

  • 1^{\text{st}}\text{pc} = 3: fully open configuration
  • 1^{\text{st}}\text{pc} = -3: fully closed configuration

This holds from 2-finger grippers to 5-finger anthropomorphic hands, and even for the non-rigid MANO hand model. It indicates that PCHands has learned a semantically consistent representation rather than just a dimensionality reduction.

4.3 RL-Based Manipulation Tasks

Benchmark Setup

  • Tasks: Open-Door, Relocate-Mustard, Relocate-MeatCan, Relocate-SoupCan, Flip-Mug (5 tasks)
  • Manipulators: Allegro (16 DoF), Schunk (9 DoF), Shadow (18 DoF) (3 manipulators)
  • RL algorithms: TRPO (no demonstrations), DAPG (with demonstrations)
  • Comparison: a recent baseline that learns in joint space [Qin et al., 2022]

Learning Curve Analysis

The results in Figure 5 are impressive:

  1. Fast convergence: PCHands converges faster than the baseline in most cases
  2. DAPG advantage: human demonstrations help avoid local optima
  3. Effect of low-dimensional representations: 1-pc and 2-pc are generally better than 4-pc

The last finding is especially important. Even complex manipulators with 16, 9, and 18 DoF can be trained well with just 2 principal components, which offers a practical way to mitigate the "curse of dimensionality" in these tasks.

4.4 Cross-Source Demonstration Experiment: True Transferability

This experiment is probably the most interesting part. The team asks: "Can a policy be learned from demonstrations collected with a different manipulator?"

Experimental setup:

  • Demonstration sources: 2F (Robotiq-2f85), 3F (Kinova-3f), 4F (LEAP-hand)
  • Target manipulators: the same three
  • 9 combinations in total (including identical source and target)

Results (Figure 6):

  • As expected, identical source-target combinations perform best
  • Demonstrations from a different source still give consistently higher performance than TRPO
  • The effect is most pronounced for the 4F target

Why does this matter? Collecting demonstrations is expensive in real robot research. With PCHands, demonstrations already collected on one manipulator can be reused to train a policy for another, as shown here for the 2F, 3F, and 4F manipulators.

4.5 Real-World Transfer Experiment

Deploying a policy learned in simulation on a real robot is something of a holy grail in robot learning. PCHands shows promising results here as well.

Experimental Platform

  • 7-DoF Franka-Panda robot arm
  • Robotiq-2f85 (2F) or LEAP-hand (4F)
  • RealSense L515 external RGB-D camera
  • 6D object pose tracking with FoundationPose

Results

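| Task | Target | 2F demo | 3F demo | 4F demo | Average |
|---|---|---|---|---|---|
| Relocate-Mustard | 2F | 90% | 100% | 100% | 97% |
| Relocate-Mustard | 4F | 100% | 80% | 90% | 90% |
| Relocate-MeatCan | 2F | 100% | 80% | 90% | 90% |
| Relocate-MeatCan | 4F | 50% | 30% | 70% | 50% |
| Relocate-SoupCan | 2F | 80% | 80% | 70% | 77% |
| Relocate-SoupCan | 4F | 70% | 50% | 0% | 40% |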
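The averages in the table can be checked with a few lines of arithmetic. The snippet below only re-enters the success rates reported above; the per-target means at the end are derived values computed here for comparison and do not appear in the source.

```python
# Success rates (%) from the table: (2F demo, 3F demo, 4F demo) per task and target.
success = {
    ("Relocate-Mustard", "2F"): (90, 100, 100),
    ("Relocate-Mustard", "4F"): (100, 80, 90),
    ("Relocate-MeatCan", "2F"): (100, 80, 90),
    ("Relocate-MeatCan", "4F"): (50, 30, 70),
    ("Relocate-SoupCan", "2F"): (80, 80, 70),
    ("Relocate-SoupCan", "4F"): (70, 50, 0),
}

# Row averages, rounded to whole percent as in the table.
row_average = {key: round(sum(row) / len(row)) for key, row in success.items()}

def target_mean(target):
    """Mean success rate over all tasks and demo sources for one target."""
    values = [rate for (_, tgt), row in success.items() if tgt == target for rate in row]
    return sum(values) / len(values)

mean_2f = target_mean("2F")  # 790 / 9, about 87.8
mean_4f = target_mean("4F")  # 540 / 9 = 60.0
```

Averaged over all tasks and demonstration sources, the 2F target succeeds in roughly 88% of trials and the 4F target in 60%, which quantifies the gap discussed next.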

Points worth noting:

  1. The 2F manipulator keeps a high success rate on every task
  2. The 4F manipulator degrades on MeatCan and most of all on SoupCan (50% and 40% on average)

The cause of the degradation is interesting. A 4-finger encompassing grasp heavily occludes the object, so the accuracy of the vision-based pose tracker (FoundationPose) drops. In simulation the ground-truth pose is always available, so the problem does not arise. This pinpoints a concrete source of the sim-to-real gap and suggests a direction for future work.


5. Strengths and Limitations

Strengths

1. Practical dimensionality reduction

Conventional joint-space learning suffers from the curse of high-dimensional action spaces. PCHands performs well with only 2 principal components, which substantially improves RL sample efficiency.

2. Plug-and-play architecture

Separating the synergy model from the hardware layer makes it easier to add a new manipulator. For a new manipulator:

  1. Define the 22 anchor positions
  2. Implement the forward and inverse kinematics functions
  3. Fine-tune the existing model (or retrain from scratch)

3. Data efficiency

Being able to use demonstrations from other manipulators is a practical benefit. It can greatly reduce the cost of collecting expensive robot demonstrations.

4. Interpretability

The fact that the first principal component means "hand opening/closing" is more than a curiosity. It shows that the learned representation is semantically meaningful, which helps with debugging and policy analysis.

Limitations

1. Manual anchor placement

The positions of the 22 anchors must be defined by hand for each manipulator, so every new manipulator requires expert knowledge and time.

2. Heuristic choice of reference manipulators

The choice of reference manipulators for ICP alignment relies on the heuristic of "simplicity and morphological diversity". It is not clear that this is the optimal choice.

3. Dependence on vision-based pose estimation

As the real-world experiments revealed, vision-based pose tracking that is sensitive to occlusion can become a bottleneck. This is a system integration issue rather than a limitation of PCHands itself, but it has to be considered in deployment.

4. No tactile or force information

PCHands is currently a purely geometric representation. Important manipulation modalities such as tactile sensing and grasp force are not included.

5. Limited task scope

The tasks tested (Open-Door, Relocate, Flip) are relatively simple. Its effectiveness on more complex dexterous manipulation tasks such as in-hand manipulation or tool use has not been verified.


6. Comparison with Prior Work and Positioning

Lineage of Postural Synergy Research

PCHands builds on a long history of research on postural synergies for robot hands:

  1. PCA-based approaches [Ciocarlie 2007, Bernardino 2013]: PCA applied directly in joint space
  2. GPLVM-based [Xu 2016]: a nonlinear probabilistic model that improves reconstruction error
  3. AE/CVAE-based [Starke 2018, Dimou 2023]: deep learning to learn richer latent spaces

What sets PCHands apart is a unified representation across multiple manipulators. Most prior work focused on a single manipulator.

Comparison with Retargeting Research

In hand posture retargeting, AnyTeleop [Qin et al., 2022] is the current state of the art. The main differences from PCHands are:

| Aspect | AnyTeleop | PCHands |
|---|---|---|
| Retargeting method | Optimization-based (every frame) | Forward pass only |
| Computational cost | High | Low |
| Real-time capability | Limited | Real-time capable |
| Shared representation | None (joint space) | Yes (synergy space) |

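The key advantage of PCHands is that the synergy model itself retargets with a forward pass alone, without per-frame optimization. This is an important advantage for real-time teleoperation. Note that the decode pass still ends in the per-manipulator inverse kinematics step described in Section 3.4.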


7. Future Research Directions

This section covers the future directions proposed by the authors, along with some additional directions of my own.

Future Directions Proposed by the Authors

  1. Scaling to a larger, open set of manipulators: from the current 17 to dozens or hundreds
  2. Using large public datasets: learning complex dexterous tasks from the 2-finger gripper data in Open X-Embodiment or from human demonstration data

Additional Directions Worth Considering

1. Automated anchor placement

Methods that determine anchor positions automatically through mesh analysis or learned functional similarity. This would lower the barrier to integrating a new manipulator.

2. Multimodal extension

Integrating tactile, force, and proprioceptive information into the synergy representation. This is likely to be essential for more delicate manipulation tasks.

3. Hierarchical synergy representation

Learning a hierarchical synergy structure per task or per grasp type instead of the current single-level PCA. This would enable compositionality for more complex behaviors.

4. Online adaptation

Few-shot or meta-learning approaches that adapt quickly to a new manipulator from a small amount of data.

5. Handling real-world occlusion

Addressing the occlusion problem of vision-based tracking through tactile feedback or robust state estimation.


8. Practical Implications: When Should You Use PCHands?

When It Is Recommended

  1. When demonstrations have to be collected with different manipulators: if a lab has several kinds of robot hands, demonstrations collected with any of them can be used

  2. When RL sample efficiency matters: the low-dimensional representation of PCHands helps avoid the curse of high-dimensional action spaces

  3. When real-time retargeting is needed: in teleoperation systems, optimization-based methods introduce latency, while PCHands can run in real time with a forward pass alone

  4. When an interpretable policy is needed: an action space that is semantically interpretable (for example, PC1 = hand opening/closing) makes debugging and analysis easier

When It May Not Be a Good Fit

  1. When only a single manipulator is used: if transferability is not needed, a simpler method may be more appropriate

  2. When very fine finger control is required: a 2-pc representation limits independent control of individual fingers

  3. When new manipulators must be added frequently: the overhead of manual anchor placement can become a burden


9. Conclusion

PCHands offers a practical and effective answer to a long-standing problem in robot manipulation: unifying representations across diverse manipulators.

The core idea, "model the differences between manipulators with a CVAE and extract postural synergies with PCA", is simple but powerful. This combination delivers:

  • A consistent semantic representation across 17 very different manipulators
  • Faster RL convergence than the existing baseline
  • The ability to use cross-manipulator demonstrations
  • Promising results in real-world transfer

There are limitations, of course. Manual anchor placement, the heuristic choice of reference manipulators, and the dependence on vision-based tracking remain open issues.

Still, on the question of "how do we combine diverse robot data to learn general-purpose manipulation policies?", PCHands is clearly a step forward. On the road to robot foundation models, representation learning research of this kind will be an essential building block.

โ›๏ธ Dig Review

โ›๏ธ Dig โ€” Go deep, uncover the layers. Dive into technical detail.

Robotics increasingly calls for generalized manipulation models that span manipulators of many different forms. Human hand motion data is abundant, for example, whereas robot datasets are small (and mostly gripper-centric), and data for high-DoF manipulation is scarcer still. Controlling directly in joint space (Joint Angle Space, JAS) also becomes inefficient to learn as the number of degrees of freedom grows, while Cartesian control of the end-effector alone (Cartesian Space, CAS) cannot shape the hand precisely. The paper "PCHands: PCA-based Hand Pose Synergy Representation on Manipulators with N-DoF" therefore proposes a unified hand posture representation that covers multiple kinematic structures (two- to five-fingered hands, anthropomorphic hands, and so on). The core idea is to describe the hand or gripper shape of every manipulator with anchor points, encode the full range of varying manipulator configurations into a latent space with a CVAE, and then decompose that latent space with PCA to extract shared postural synergies. The resulting principal components represent motion patterns, such as opening and closing the hand, that hold across all of the mechanisms.

Figure 1. Example anchor placement in PCHands. Each manipulator is given 22 anchors that represent the fingers and the palm. On a five-fingered hand, for example, each finger receives one anchor on the proximal, middle, and distal phalanges and one at the fingertip; on a two-finger gripper, one jaw is assigned the 4 thumb-style anchors and the other jaw is assigned 16. Anchor points are 3D points defined at suitable locations on each manipulator's kinematic chain. Expressing them in a unified coordinate frame lets robots with different degrees of freedom be compared and learned in the same dimensionality. PCHands first defines an initial end-effector frame from cues such as the wrist direction (see figure), then aligns the anchor frames of all manipulators with Iterative Closest Point (ICP) to keep the representation consistent.

Mathematical Techniques: PCA, CVAE, and Anchor-Based Mapping

PCHands rests on two mathematical ingredients: dimensionality reduction and kinematic mapping. Principal component analysis (PCA) linearly decomposes high-dimensional data along the directions of greatest variance (the principal components) and represents it in a lower-dimensional subspace. According to the paper, PCA has long been used to reduce the posture space of robot hands and to find the synergies (dominant basis vectors) of grasp poses. Santello et al., for example, showed in a neuroscience study that a few principal components of human hand posture account for most of the variance (at least 80%). PCHands carries this postural-synergy concept over to multiple manipulators. With plain PCA, however, the manipulators differ in shape, so the first principal component ends up tracking structural differences rather than finger flexion. PCHands therefore uses a hybrid scheme: a conditional variational autoencoder (CVAE) first learns a nonlinear latent representation, and PCA is then applied in that latent space. A CVAE is an autoencoder between an input (anchor positions) and an output (their reconstruction) with an added regularization term on the latent distribution (a KL divergence); here it is trained with the manipulator ID as the condition. Its loss applies a weight to the L1 reconstruction error, and the latent space is trained to follow a normal distribution. As a result, the posture changes seen across different robot configurations are captured by a common latent vector z.
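Written out, the CVAE objective described above could take the following form. This is a sketch under assumed notation, not the paper's exact equation: x denotes the stacked anchor positions, c the manipulator ID, z the latent vector, q_φ the encoder, f_θ the decoder, and λ the weight on the L1 reconstruction term (all symbols are introduced here for illustration).

```latex
\mathcal{L}(\theta, \phi)
  = \lambda \,\bigl\lVert x - \hat{x} \bigr\rVert_1
  + D_{\mathrm{KL}}\!\bigl( q_\phi(z \mid x, c) \,\big\|\, \mathcal{N}(0, I) \bigr),
\qquad \hat{x} = f_\theta(z, c)
```

The first term pulls the reconstruction toward the input anchors, and the second keeps the latent distribution close to a standard normal, which is what makes a later linear PCA over z reasonable.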

In the subsequent PCA reduction stage, linear PCA is run on the CVAE latent vector z to obtain its principal components (PCs). This variable-length PCA representation of the latent is the hand synergy representation, and it is flexible because the number of dimensions (e.g., 1 to 10 PCs) can be chosen freely. The paper observes that the first PC corresponds to a universal hand-opening motion across all hands. During manipulation the PC coefficients serve as both observation and action space; an RL agent, for example, uses 1 to N PC coefficients as input and output. In the encode pass, each robot's forward kinematics yields the anchor positions, the CVAE encoder maps them to a latent z, and z is projected onto the PCA coefficients. In the decode pass, the PC coefficients are mapped back through the CVAE to reconstruct anchor positions, and optimization-based inverse kinematics then computes the joint angles that realize the robot posture. The synergy model (the PC transform) is thus shared, and only the kinematic layer (forward and inverse kinematics) is specific to each robot, so the synergies can be used independently of the hardware.
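The PCA part of the encode and decode passes can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the CVAE is replaced by random stand-in latents, and `fit_pca`, `encode`, `decode`, and the latent size of 16 are hypothetical choices made only for this example.

```python
import numpy as np

def fit_pca(latents: np.ndarray, n_components: int):
    """Fit PCA on latent vectors (rows = samples) via SVD."""
    mean = latents.mean(axis=0)
    centered = latents - mean
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def encode(z: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Project a latent vector onto the leading principal components."""
    return (z - mean) @ components.T

def decode(coeffs: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Map PC coefficients back to an approximate latent vector."""
    return coeffs @ components + mean

rng = np.random.default_rng(0)
# Stand-in for CVAE latents: 500 samples of an assumed 16-D latent space.
latents = rng.normal(size=(500, 16)) * np.linspace(3.0, 0.1, 16)

mean, components = fit_pca(latents, n_components=2)
coeffs = encode(latents[0], mean, components)   # 2 synergy coefficients
z_hat = decode(coeffs, mean, components)        # approximate latent, 16-D
```

In a full pipeline, `z_hat` would be passed to the CVAE decoder together with the manipulator ID, and the reconstructed anchors would then go to the robot-specific inverse kinematics.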

Finally, a frame-correction procedure that aligns the anchor coordinate frames is folded into the iterative training loop. The algorithm fixes a few reference manipulators (e.g., 2F, 3F, the Almart hand), samples many interpolated points along the first principal component, and decodes them. For each of these motions it then uses ICP to compute the discrepancy between the mean anchor positions of the reference robots and the anchor positions of the target robot, and uses the result to fine-tune the target's end-effector frame. Repeating this process aligns the anchor representations of all robots as consistently as possible. In short, PCHands works as a pipeline of Anchor Description Format (ADF) → CVAE training → PCA reduction → frame alignment, which together produce a shared low-dimensional hand posture synergy space.
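If anchor correspondences are taken as known (each of the 22 anchors has a fixed index), the correspondence search in ICP is already solved, and one alignment step reduces to a closed-form rigid fit, the Kabsch algorithm. The sketch below shows only that single step on synthetic anchors. It is an assumption about what one refinement iteration could look like, not the paper's implementation; `kabsch` and all the data are hypothetical.

```python
import numpy as np

def kabsch(source: np.ndarray, target: np.ndarray):
    """Rigid transform (R, t) minimising sum ||R @ source_i + t - target_i||^2."""
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    h = (source - src_c).T @ (target - tgt_c)   # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = tgt_c - r @ src_c
    return r, t

rng = np.random.default_rng(0)
anchors = rng.normal(size=(22, 3))              # synthetic target-robot anchors
angle = np.deg2rad(30.0)
r_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.02, -0.01, 0.05])
target = anchors @ r_true.T + t_true            # stand-in for reference mean anchors

r, t = kabsch(anchors, target)
aligned = anchors @ r.T + t                     # anchors expressed in the corrected frame
```

With unknown or noisy correspondences, a full ICP loop would alternate a nearest-neighbour matching step with this rigid fit until the alignment stops changing.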

Comparison with JAS, CAS, and Synergy-Based Models

Mapping and control schemes for robot hands fall broadly into direct joint (JAS), Cartesian space (CAS), dimensionality reduction (synergy), and task-oriented categories. In the taxonomy of Meattini et al., JAS (Direct Joint) commands each joint angle directly, while CAS (Direct Cartesian) generates motion by commanding the position and orientation of the fingertips or the end-effector. Synergy-based schemes control the high-dimensional joints through a low-dimensional latent vector (obtained with PCA, for example). In prior work, synergy models were used to find the principal components of human hand data or robot hand configurations, and were applied to retargeting human demonstrations onto robots (bimanual teleoperation, for instance) and to grasp generation.

Of these, PCHands adopts synergy-space control and shows advantages over JAS and CAS. The paper's RL experiments target dexterous hands with 16 to 18 degrees of freedom, such as Allegro, Schunk, and Shadow. The baseline uses JAS and controls every joint angle directly. PCHands instead controls the hand with only 1 to N principal component coefficients, with the end-effector position added on top (flying-hand mode). In the experiments, PCHands outperformed the JAS baseline in both learning speed and consistency. Notably, even 1 to 2 PCs were enough to learn effectively on hands with 16 to 18 degrees of freedom. Because PCHands handles end-effector position commands (CAS) alongside the principal components in flying-hand mode, it combines the strength of CAS (controlling where the hand is relative to the object) with the strength of synergies (low-dimensional control of hand shape).
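The difference in action dimensionality can be made concrete with a small sketch. The layout below is an assumption for illustration only: `EE_DIM = 6` (three translation and three rotation components for the flying hand), `jas_action`, and `synergy_action` are hypothetical names and are not taken from the paper or its code.

```python
import numpy as np

EE_DIM = 6  # assumed: 3 translation + 3 rotation components for the flying hand

def jas_action(ee_command: np.ndarray, joint_targets: np.ndarray) -> np.ndarray:
    """Baseline-style action: end-effector command plus one target per joint."""
    return np.concatenate([ee_command, joint_targets])

def synergy_action(ee_command: np.ndarray, pc_coeffs: np.ndarray) -> np.ndarray:
    """Synergy-style action: end-effector command plus a few PC coefficients."""
    return np.concatenate([ee_command, pc_coeffs])

ee = np.zeros(EE_DIM)
a_jas = jas_action(ee, np.zeros(16))      # e.g. a hand with 16 joint targets
a_syn = synergy_action(ee, np.zeros(2))   # 2 principal components

print(a_jas.shape, a_syn.shape)
```

Under these assumptions the policy output shrinks from 22 to 8 dimensions, and the synergy layout stays the same size regardless of how many joints the hand has.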

Qualitatively, too, the first principal component extracted by PCHands represents a hand open/close motion common to all mechanisms. The bottom of Figure 3 and Figure 4 show that with plain PCA ("vanilla PCA") the samples of each manipulator cluster by structure, whereas with CVAE+PCA (PCHands) these structural differences are damped and the change in hand shape becomes the dominant axis (Figure 3). PCHands thus learns a consistent synergy basis across hands of very different forms, and the same synergy coordinates can be used to reproduce a posture on different robots (retargeting). When footage of a human hand in arbitrarily varied postures is mapped onto a five-fingered or a two-fingered robot, for example, PCHands yields natural postures on both without a complex optimization process.

Overall, the comparison shows PCHands improving learning efficiency and success rate over JAS-based reinforcement learning, which indicates that the low-dimensional synergy representation helps the same task be learned faster and more stably. PCHands was also robust in demonstration-augmented learning (DAPG) that uses demonstrations from different source robots. Specifically, when demonstrations collected on two-, three-, and four-fingered robots were applied to other hands, policies using PCHands achieved consistently high performance, with no large penalty compared to demonstrations that came from a robot of the same form. This suggests that converting to PCHands' latent synergy coordinates lets the useful information be reused rather than lost. No direct comparison was run against traditional synergy models (e.g., the Santello approach) or other synergy extensions (new methods one might call "Synergy+"), but the PCHands approach is at its strongest when manipulator morphologies are heterogeneous. Earlier methods did not handle a two-finger gripper and a five-fingered hand in the same vector space, for example, and PCHands is original in this respect.

Experimental Results and Analysis

Qualitative Synergy Analysis

In the analysis of the synergies PCHands learned, the first principal component (PC1) was the most meaningful. According to the paper, PC1 represents a common hand open/close motion across 17 robots with a wide range of degrees of freedom. In Fig. 4, sweeping PC1 from +3 to -3 takes every robot from a fully open hand to a closed fist. The 16 rigid-body robots and the one non-rigid hand (the MANO model) all share the same PC1 axis and move consistently along it. This means that PCHands shares a single latent representation across robots of different shapes.

Comparing plain PCA with PCA applied to the CVAE latent space (bottom of Figure 3) shows that, with plain PCA alone, PC1 reflects the differences between robot types rather than hand shape. PCHands (CVAE+PCA), by contrast, removes the structural differences and emphasizes changes in hand shape, so it can extract a genuine synergy representation. In other words, once the CVAE has modeled the inter-manipulator variation to some degree, applying PCA on top reduces the spread between structures and makes the posture changes of all hands comparable along the same axes.
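A toy example can show why pooled PCA picks up structure instead of grasp motion. The sketch below is purely synthetic and is not the paper's method: three hypothetical "hands" share one open/close direction but sit at different structural offsets, and subtracting each hand's mean is used as a crude stand-in for conditioning on the manipulator ID. All names and numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_per_hand = 6, 200
shared_axis = np.eye(dim)[0]       # shared "open/close" direction
structure_axis = np.eye(dim)[1]    # direction along which the hands differ
offsets = [-5.0, 0.0, 5.0]         # one structural offset per hand

samples, hand_ids = [], []
for hand_id, offset in enumerate(offsets):
    grasp = rng.uniform(-1.0, 1.0, size=(n_per_hand, 1))    # open/close amount
    noise = 0.05 * rng.normal(size=(n_per_hand, dim))
    samples.append(grasp * shared_axis + offset * structure_axis + noise)
    hand_ids.append(np.full(n_per_hand, hand_id))
x = np.vstack(samples)
hand_ids = np.concatenate(hand_ids)

def first_pc(data: np.ndarray) -> np.ndarray:
    """Return the direction of greatest variance of the rows of data."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# Vanilla PCA on the pooled data: PC1 follows the structural offsets.
pc1_pooled = first_pc(x)

# Remove each hand's own mean first (stand-in for conditioning), then run PCA.
x_cond = x.copy()
for hand_id in range(len(offsets)):
    mask = hand_ids == hand_id
    x_cond[mask] -= x[mask].mean(axis=0)
pc1_cond = first_pc(x_cond)

print(abs(pc1_pooled @ structure_axis), abs(pc1_cond @ shared_axis))
```

In this toy setting the pooled PC1 lines up with the structural axis, while the conditioned PC1 lines up with the shared open/close axis. A CVAE plays a similar role in a far more flexible, nonlinear way.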

Reinforcement Learning Benchmark

To evaluate how useful PCHands is in practice, the authors ran experiments on five dexterous manipulation tasks, four of them new (Open-Door, Relocate-*, Flip-Mug). Each task was run on three robot hands (Allegro, Schunk, Shadow), with TRPO and DAPG (demonstration-augmented policy gradient) as the reinforcement learning algorithms. The baseline follows Qin et al. (2022): a policy controlled purely in JAS.

PCHands was consistently better in both the learning curves and the final success rates. The learning curves in Fig. 5 show that, on most tasks, PCHands policies converge faster than JAS-based policies (that is, they reach high returns sooner) and perform more stably. PCHands had the edge under both TRPO and DAPG, and with demonstration-driven DAPG in particular the gap was large from the very start of training. PCHands with only 1 to 2 PCs also did better than with 4 PCs. This suggests that even for hands with 16 to 18 degrees of freedom, roughly two synergy dimensions are enough to carry out these tasks, meaning that a complex high-DoF system can be controlled effectively with a very small number of synergy dimensions.

PCHands also proved robust in demonstration-based learning. Policies were trained on 50 demonstrations of the same task collected on different robots (2F, 3F, 4F), and with PCHands the target robot's performance held up relatively well even when the source robot differed. When a 4F robot was trained on demonstrations collected on 2F, for example, DAPG (which uses the demonstrations) still achieved a consistently better success rate than TRPO, and 4F tasks reached good performance with demonstrations from other sources. This shows that similar task information can be shared in synergy space even when the robot morphologies differ.

Real-World Application

The authors also transferred PCHands policies to real robots. A Robotiq 2F gripper or a 4F LEAP hand was mounted on the end of a 7-DoF Franka-Panda arm, and the simulation policies above (DAPG trained for 0-400 episodes) were executed zero-shot. On Relocate and several other tasks, both 2F and 4F lost a little performance relative to simulation but largely succeeded. The 4F hand in particular did worse on the SoupCan task because of visual tracking problems when the object was occluded by the fingers, while 2F reached success rates close to simulation on most tasks. This suggests that policies learned with PCHands can work consistently in real environments as well.

Critical Analysis and Future Work

PCHands shows that a shared low-dimensional hand posture representation can be learned effectively across different manipulators. One notable advantage is that it is trained on randomly sampled anchor positions rather than a large dataset, so it can be put to use in RL right away without lengthy training. The anchor + CVAE + PCA combination also offers expressiveness and flexibility at once: the CVAE covers complex postures, and PCA reduces them to whatever dimensionality is wanted. The practical benefits are clear as well: better learning efficiency, shared demonstrations, and the potential for real-world transfer.

There are, however, several assumptions and limitations. First, PCHands is based on posture (anchor positions), so it does not capture dynamic information such as object contact or force control; including changes in fingertip contact force in the synergies would require further research. Second, every manipulator needs 22 anchors defined and known forward and inverse kinematic models, so introducing a new robot takes preparation work. Third, CVAE training and ICP alignment are repeated, which makes the training procedure complex and fairly costly to compute (including data collection for the various robots). Fourth, the evaluation tasks were mostly ordinary manipulation scenarios, and very complex or confined settings (difficult grasps, object rotation, and the like) may require more synergies. Indeed, the 4F (LEAP) hand lost performance when visual tracking of the object was difficult, which may stem from the lack of integration between synergy control and the perception system.

Future work could introduce nonlinear dimensionality reduction (e.g., GPLVM, nonlinear PCA) or add tactile and force information to the synergy space to improve behavior. Extension to a wider range of robots and tasks is also needed, especially robots with curved or deformable (soft) hands. PCHands uses posture data, and extending it to learn from visuo-tactile multimodal data would make it more general. Finally, on the application side, practical uses such as teleoperation and training multiple robots with a single model are worth exploring.

Conclusion: Significance and Applicability of Synergy Control

By successfully learning a common hand posture representation across robot hands with different degrees of freedom, PCHands points to a new way of solving high-dimensional robot control problems in a low-dimensional synergy space. As the reinforcement learning experiments show, PCHands improves learning efficiency and lets existing demonstration data be transferred between robots to obtain control policies quickly. This would be useful, for example, for recording a human hand motion once and sending it to several robot hands, or for controlling diverse robots in one unified way. The ability to express complex movements with 1 to 2 synergy dimensions is of particular practical value for high-DoF robot hands. It could address the problem of scaling degrees of freedom, for instance by operating a multi-hand system mounted on a spacecraft or a disaster-response robot with a single synergy controller.

In summary, PCHands combines PCA-based synergies with anchor-based kinematic mapping to propose a general-purpose representation for robot manipulation. The representation absorbs kinematic differences to enable consistent control across diverse robots, and it delivered meaningful performance gains in reinforcement learning and learning from demonstrations in particular. Open problems remain, such as object interaction and capturing nonlinear characteristics, but PCHands opens up new possibilities for high-DoF robot control and teleoperation. It is expected to be applied widely to real robot systems and to contribute to a universal platform for robot hand control.

Copyright 2026, JungYeon Lee