
📃 Path-Tracing Sim2Real Tactile

tactile
sim2real
rendering
Automatic Physically-Based Sim2Real for Tactile Images through Differentiable Path-Tracing Rendering
Published

May 19, 2026


🔍 Ping Review

🔍 Ping: A light tap on the surface. Get the gist in seconds.


🔔 Ring Review

🔔 Ring: An idea that echoes. Grasp the core and its value.

Introduction

Problem definition: why tactile simulation runs into the "wall of reality"

GelSight-style vision-based tactile sensors use an internal camera to image the fine deformation that appears when a soft gel surface is pressed against an object, and from it they read the contact shape, texture, and even shear force at high resolution. They are widely used in robot manipulation for slip detection, object recognition, and reinforcement-learning policy training, but training data-driven algorithms requires large, high-quality datasets. Collecting tens of thousands of images with a real robot is slow and expensive, so the field naturally moved toward generating the data in simulation.

The problem is the sim-to-real gap that stubbornly remains between simulated images and real sensor images. The paper points to two main causes.

  1. Complex optical effects are hard to model, especially the refraction that occurs when light passes through the glass layer protecting the sensor. Because light bends at the boundary between media, the apparent position and size of the markers (black dots) embedded in the gel are distorted. What the camera sees is not the true marker position but the position as refracted through the glass.
  2. Physical parameters are hard to estimate, namely the sensor's camera pose and lighting conditions. Even in the commercial GelSight Mini, manufacturing tolerance makes the camera position differ slightly from unit to unit. Existing methods nevertheless usually assume the camera pose is already known.

Limitations of existing approaches

Tactile rendering has evolved roughly as follows.

  • Rigid body + depth camera + Gaussian smoothing: fast, but it captures only the normal force and misses the marker motion caused by shear (TACTO, Taxim, and others).
  • Deformable simulation (FEM, MPM): the gel is modeled as a soft body and marker displacement is tracked, which raises fidelity (TacFlex and others).
  • Rendering side: simple rasterization with hand-made LED effects → ray tracing for shadows → post-hoc correction based on Snell's law. All of these need manual tuning and assume the camera pose is known.

The paper's starting point is clear: instead of patching refraction, lighting, and pose with separate corrections, why not build them into a physically based optical model from the start and optimize them automatically from real images? The key that makes this possible is differentiable rendering.

The core idea in one sentence

From just 3 real images, a differentiable path-tracing renderer (Mitsuba 3) automatically optimizes physical parameters such as camera pose, lighting, and texture by gradient descent, and generates tactile images that are physically accurate down to the glass refraction.

The authors claim four contributions.

  • The first fully differentiable rendering pipeline for vision-based tactile sensors, with gradient-based optimization of pose, texture, and lighting from real images.
  • An optimization scheme that explicitly models key optical effects such as glass refraction and RGB lighting.
  • A framework that quickly generates high-fidelity synthetic data through image-to-image translation (Pix2Pix) with NOCS maps as input.
  • Extensive validation on a real robot platform, plus an inverse-problem application that reconstructs shape from a single tactile image.

Method

The big picture: why differentiable "path tracing" in particular

Let's start with the intuition. Ordinary rendering is a forward computation from scene to image. Differentiable rendering adds the backward direction: given an error in the image, it computes the gradient that says how the scene parameters should change to reduce that error. The renderer itself becomes one large differentiable function, so the scene can be fitted the same way a neural network is trained.

Differentiating the rendering equation, the fundamental equation of rendering, with respect to a parameter $\theta$ gives the following form (paper Eq. 1).

$$\frac{\partial I}{\partial \theta} = \frac{\partial}{\partial \theta} \int_{\Omega} L_i(\omega_i)\, f_r(\omega_i, \omega_o)\, (n \cdot \omega_i)\, d\omega_i$$

Term by term:

  • $L_i(\omega_i)$: the incident light (illumination) arriving from direction $\omega_i$.
  • $f_r(\omega_i, \omega_o)$: the BSDF (bidirectional scattering distribution function), which describes how the surface reflects and refracts light. The refraction and specular reflection of the glass live here.
  • $(n \cdot \omega_i)$: the angle term between the surface normal and the light direction (Lambert's cosine).
  • $\theta$: any render parameter to be optimized, such as material, geometry, or lighting.

The key point is that the whole integral can be differentiated with respect to $\theta$. We can then define a loss that measures how far the rendered image $I$ is from the real image and update $\theta$ in the direction that reduces it.
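To make the "differentiate, then descend" loop concrete, here is a minimal sketch in plain Python. It is not Mitsuba 3 and not the paper's code: it assumes a single pixel, one light, and a Lambertian surface, and the names `render_pixel` and `fit_albedo`, the learning rate, and all numeric values are hypothetical choices for illustration.

```python
import math

def render_pixel(albedo, light, cos_theta):
    """Toy forward 'renderer': one Lambertian pixel lit by one light."""
    return albedo / math.pi * light * max(cos_theta, 0.0)

def d_render_d_albedo(light, cos_theta):
    """Analytic derivative of render_pixel with respect to albedo."""
    return light * max(cos_theta, 0.0) / math.pi

def fit_albedo(target, light, cos_theta, albedo=0.1, lr=0.5, steps=200):
    """Gradient descent on the squared error between render and target."""
    for _ in range(steps):
        residual = render_pixel(albedo, light, cos_theta) - target
        # Chain rule: dL/d(albedo) = 2 * residual * dI/d(albedo)
        grad = 2.0 * residual * d_render_d_albedo(light, cos_theta)
        albedo -= lr * grad
    return albedo

# Synthetic "real" observation produced with a known albedo (assumed value).
true_albedo = 0.8
observed = render_pixel(true_albedo, light=3.0, cos_theta=0.9)
estimate = fit_albedo(observed, light=3.0, cos_theta=0.9)
```

Here $\theta$ is a single albedo value; in the paper's setting the same loop runs over pose, lighting, and texture, with the path tracer supplying the derivative instead of a hand-written formula.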

The authors note that in general computer vision, complex geometry and heavy occlusion introduce discontinuities into the gradients and make differentiable rendering difficult, whereas tactile images have smooth deforming surfaces and continuous contact regions, so gradient computation is stable. That this domain is a good match for differentiable rendering is an important insight.

Three things Mitsuba 3 models physically:

  1. Light transport through the protective glass layer, with accurate BSDF modeling that includes refraction and specular reflection.
  2. Natural shadows through path tracing, which captures global illumination effects.
  3. Differentiable parameter optimization through gradients.

Pipeline overview

flowchart TD
    A[3 real images:<br/>texture ref + cross-pattern + tactile] --> B[Stage 1:<br/>Pose Optimization<br/>without glass]
    B --> C[Stage 2:<br/>Pose Refinement<br/>WITH glass refraction]
    C --> D[Joint Lighting &<br/>Material Optimization]
    D --> E[Marker Texture<br/>Optimization]
    E --> F[Calibrated<br/>Differentiable Renderer]
    F --> G1[High-fidelity<br/>RGB + NOCS dataset<br/>56k samples]
    F --> G2[Inverse rendering:<br/>mesh reconstruction<br/>from single image]
    G1 --> H[Pix2Pix:<br/>NOCS map -> RGB<br/>real-time inference]
    style C fill:#ffe0b2
    style F fill:#c8e6c9
    style H fill:#bbdefb

The pipeline can be viewed as a front end (deformation physics) and a back end (optical rendering). The deformation physics extends the FEM-based simulator TacFlex, which the authors judge to offer a good balance of speed, accuracy, and transfer performance, and a new differentiable rendering component sits on top of it. There are only three inputs: ① a single real tactile image, ② a mesh model of the sensor, and ③ the known marker grid specification (9×7 array, 2 mm × 2.05 mm spacing, 1 mm diameter).

Stage 1: Pose optimization without glass

The camera pose is first estimated coarsely. A perfect binary texture is generated from the marker grid specification, the textured mesh is placed at a random pose, and optimization starts from there. The loss is a simple marker position error (paper Eq. 2).

$$L_{\text{pose}} = \sum_{j=1}^{N} \left[ (u_j - u_j^{\text{ref}})^2 + (v_j - v_j^{\text{ref}})^2 \right]$$

  • $(u_j, v_j)$: the screen coordinates of marker $j$, obtained with a differentiable 3D→2D projection.
  • $(u_j^{\text{ref}}, v_j^{\text{ref}})$: the marker coordinates detected in the real image with a circle Hough transform.

Intuition: the mesh pose is pulled until the virtual markers overlap the marker positions in the real image. This stage uses only the pinhole camera model, so it is fast, but it ignores glass refraction.
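The Stage 1 loss can be sketched in a few lines of plain Python. This is a simplified illustration, not the paper's implementation: it optimizes only an in-plane translation rather than a full 6-DoF pose, and the focal length, principal point, gel depth, learning rate, and function names are all assumed values. Only the 9×7 grid with 2 mm × 2.05 mm spacing comes from the text.

```python
FOCAL_PX = 500.0          # assumed focal length in pixels
PRINCIPAL = (160.0, 120.0)  # assumed principal point in pixels

def project(point):
    """Pinhole projection of a camera-frame point (x, y, z) to pixels (u, v)."""
    x, y, z = point
    return FOCAL_PX * x / z + PRINCIPAL[0], FOCAL_PX * y / z + PRINCIPAL[1]

def marker_grid(cols=9, rows=7, dx=2.0, dy=2.05, depth=20.0):
    """Marker centres in mm on a flat gel; depth is an assumed distance."""
    return [((c - (cols - 1) / 2) * dx, (r - (rows - 1) / 2) * dy, depth)
            for r in range(rows) for c in range(cols)]

def pose_loss(markers, offset, refs):
    """Eq. 2: summed squared pixel error between projected and reference markers."""
    tx, ty = offset
    loss = 0.0
    for (x, y, z), (u_ref, v_ref) in zip(markers, refs):
        u, v = project((x + tx, y + ty, z))
        loss += (u - u_ref) ** 2 + (v - v_ref) ** 2
    return loss

def pose_grad(markers, offset, refs):
    """Analytic gradient of pose_loss with respect to the offset (tx, ty)."""
    tx, ty = offset
    gx = gy = 0.0
    for (x, y, z), (u_ref, v_ref) in zip(markers, refs):
        u, v = project((x + tx, y + ty, z))
        gx += 2.0 * (u - u_ref) * FOCAL_PX / z
        gy += 2.0 * (v - v_ref) * FOCAL_PX / z
    return gx, gy

markers = marker_grid()
true_offset = (0.3, -0.2)  # synthetic ground truth standing in for Hough detections
refs = [project((x + true_offset[0], y + true_offset[1], z)) for x, y, z in markers]

offset = (0.0, 0.0)
for _ in range(100):
    gx, gy = pose_grad(markers, offset, refs)
    # Small step size chosen so the quadratic loss contracts at every step.
    offset = (offset[0] - 1e-5 * gx, offset[1] - 1e-5 * gy)
```

After the loop, `offset` approaches `true_offset` and `pose_loss` drops to nearly zero. In the real pipeline the references come from circle detection on the sensor image rather than from a synthetic offset.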

Stage 2: Pose refinement with glass refraction

This is the highlight of the paper. A plain pinhole projection cannot represent glass refraction, so the markers end up misaligned (paper Fig. 2, left). Mitsuba's differentiable rendering is therefore used to compute the marker positions through physical ray tracing. The loss is a masked MSE that concentrates on the marker regions only (paper Eq. 3).

$$L_{\text{stage2}} = \sum_{i=1}^{N} M_i \cdot (I_{\text{rendered},i} - I_{\text{ref},i})^2$$

  • $M_i$: the binary mask value at pixel $i$. It is built from circle detection so that only marker regions contribute to the gradient.
  • $I_{\text{rendered},i}$: the rendered pixel value with the glass refraction effect included.
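A minimal sketch of the masked loss in Eq. 3, assuming images are flattened to plain Python lists of intensities. The function name `masked_sse` and the four-pixel example values are hypothetical.

```python
def masked_sse(rendered, reference, mask):
    """Eq. 3: squared error summed only over pixels where mask is 1."""
    return sum(m * (r - t) ** 2 for r, t, m in zip(rendered, reference, mask))

rendered = [0.2, 0.9, 0.1, 0.8]
reference = [0.2, 0.5, 0.1, 0.2]
marker_mask = [0, 1, 1, 0]  # only the two middle pixels belong to markers

loss_masked = masked_sse(rendered, reference, marker_mask)
loss_unmasked = masked_sse(rendered, reference, [1, 1, 1, 1])
```

The last pixel has the largest error, but because it lies outside the mask it contributes nothing, which is how background mismatch is kept out of the pose gradient.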

Intuition: the goal is the same as in Stage 1, matching the markers, but the marker's apparent position is now computed along the light path refracted through the glass rather than by simple projection. As a result, not only the position error but also the scale distortion caused by refraction is removed, and the render aligns almost perfectly with the real sensor (paper Fig. 2, right: white = exact match, cyan = missing markers, magenta = extra markers).

flowchart LR
    subgraph Stage1["Stage 1 (pinhole)"]
        P1[Marker 3D pos] -->|simple projection| P2[2D markers]
        P2 -.misaligned.-> P3[reference markers]
    end
    subgraph Stage2["Stage 2 (path tracing)"]
        Q1[Marker 3D pos] -->|ray through glass<br/>refraction| Q2[2D markers]
        Q2 -->|aligned + scaled| Q3[reference markers]
    end
    Stage1 ==> Stage2
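Why does a pinhole projection miss the markers? A rough feel comes from Snell's law applied to a flat slab. The sketch below is a textbook simplification, not the paper's model: it assumes the same medium on both sides of the glass, and the 3 mm thickness and refractive index of 1.5 are assumed values that do not come from the paper.

```python
import math

def refract_angle(theta_in, n_in, n_out):
    """Snell's law: n_in * sin(theta_in) = n_out * sin(theta_out), in radians."""
    s = n_in * math.sin(theta_in) / n_out
    if abs(s) > 1.0:
        raise ValueError("total internal reflection, no refracted ray")
    return math.asin(s)

def slab_lateral_shift(theta_in, thickness, n_slab, n_outside=1.0):
    """Sideways displacement of a ray after crossing a flat parallel slab."""
    theta_slab = refract_angle(theta_in, n_outside, n_slab)
    return thickness * math.sin(theta_in - theta_slab) / math.cos(theta_slab)

# Shift in mm for rays hitting an assumed 3 mm slab at increasing angles.
shift_0 = slab_lateral_shift(0.0, thickness=3.0, n_slab=1.5)
shift_10 = slab_lateral_shift(math.radians(10.0), thickness=3.0, n_slab=1.5)
shift_30 = slab_lateral_shift(math.radians(30.0), thickness=3.0, n_slab=1.5)
```

The shift is zero on the optical axis and grows with the viewing angle, so markers near the image border move more than markers at the centre. A single rigid pose change cannot absorb that angle-dependent pattern, which is consistent with the scale distortion described above.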

Stage 3: Joint optimization of lighting and material

Next comes lighting. Unlike earlier work that approximated illumination with a background environment map, this method needs physically accurate light sources. Following the real sensor layout, 3 rectangular area lights are placed around the sensor and the mesh is given a uniform color. A dual-reference strategy then uses two reference images.

  • Texture reference image (no deformation): recovers the intrinsic color (material) of the mesh while accounting for the combined illumination of the three lights.
  • Cross-pattern deformation image: exposes the side-lighting behavior from all three directions at once, so the light parameters can be estimated accurately. The cross pattern was chosen because it yields lighting information that generalizes to arbitrary shapes and deformations.

The parameters optimized jointly are the color and intensity of each of the three lights and the diffuse reflectance (color) of the mesh.

The total loss is a weighted sum over the two references (paper Eq. 4).

$$L_{\text{total}} = \alpha \cdot L_{\text{light}} + \beta \cdot L_{\text{mesh}}$$

The weights are roughly $\alpha = 0.3$ and $\beta = 0.7$. The individual terms are (Eqs. 5 and 6):

$$L_{\text{light}} = \frac{1}{N}\sum_{i=1}^{N} W_i \cdot \big(T(I_{\text{rendered},i}) - T(I_{\text{ref-light},i})\big)^2$$

$$L_{\text{mesh}} = \frac{1}{N}\sum_{i=1}^{N} \big(T(I_{\text{rendered},i}) - T(I_{\text{ref-mesh},i})\big)^2$$

Here $T(x) = x/(x+1.0)$ is a tone mapping operator. Intuitively, it compresses values in bright regions (an HDR→LDR-like effect) so that strong highlights do not dominate the loss. $W_i$ is an adaptive weight map that emphasizes regions with large reconstruction error and strongly lit regions.
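The Stage 3 loss is easy to sketch with flattened images as Python lists. The tone map and the weights 0.3 and 0.7 follow the text. The weight map is passed in as an input because the review does not say how the paper computes it, and the function names are hypothetical.

```python
def tone_map(x):
    """T(x) = x / (x + 1): compresses large radiance values toward 1."""
    return x / (x + 1.0)

def tone_mapped_mse(rendered, reference, weights=None):
    """Eqs. 5 and 6: mean squared error after tone mapping, optionally weighted."""
    n = len(rendered)
    if weights is None:
        weights = [1.0] * n  # Eq. 6 has no weight map
    return sum(w * (tone_map(r) - tone_map(t)) ** 2
               for r, t, w in zip(rendered, reference, weights)) / n

def total_loss(l_light, l_mesh, alpha=0.3, beta=0.7):
    """Eq. 4: weighted sum of the two reference losses."""
    return alpha * l_light + beta * l_mesh

# A raw difference of 90 between two highlights shrinks to about 0.09 after T.
highlight_gap = tone_map(99.0) - tone_map(9.0)

l_mesh = tone_mapped_mse([1.0, 3.0], [1.0, 1.0])
l_light = tone_mapped_mse([1.0, 3.0], [1.0, 1.0], weights=[1.0, 2.0])
combined = total_loss(l_light, l_mesh)
```

Without the tone map, the squared error of one saturated highlight pixel could outweigh thousands of mid-tone pixels. With it, every pixel's contribution is bounded.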

Stage 4: Marker texture optimization

Once pose, mesh color, and lighting are settled, the marker texture itself is refined last. The loss is a normalized MSE applied to marker pixels only (paper Eq. 7).

$$L_{\text{marker}} = \frac{1}{N_{\text{markers}}} \sum_{i \in M} (I_{\text{rendered},i} - I_{\text{ref},i})^2$$

$M$ is the set of pixels in marker regions and $N_{\text{markers}}$ is the total number of marker pixels. The result is a texture with the markers placed accurately.

The inverse problem that differentiability unlocks: mesh reconstruction from a single image

Another attraction of a differentiable pipeline is inverse rendering. With a calibrated renderer, we can go not only from a known mesh to RGB but also the other way, from an observed RGB image to mesh vertex positions (paper Eq. 8).

$$V^* = \arg\min_{V} \; L\big(I_{\text{render}}(V),\, I_{\text{real}}\big)$$

$V$ is the set of mesh vertex positions, $I_{\text{render}}(V)$ is the output of the differentiable renderer, and $I_{\text{real}}$ is the real tactile image. Gradients flow backward through the whole chain of light transport, refraction, and surface interaction to update the vertices. In other words, the deformed 3D shape is recovered from a single tactile image, a capability that is useful for tactile shape perception and manipulation.
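One way to read Eq. 8 is as an iterative update on the vertices. The form below assumes plain gradient descent. The step size $\eta$, the iteration index $k$, and the choice of optimizer are assumptions made for illustration, since the optimizer used in the paper is not stated here.

```latex
V^{(k+1)} = V^{(k)} - \eta \, \nabla_V L\big(I_{\text{render}}(V^{(k)}),\, I_{\text{real}}\big),
\qquad
\nabla_V L = \left(\frac{\partial I_{\text{render}}}{\partial V}\right)^{\!\top} \frac{\partial L}{\partial I_{\text{render}}}
```

The second expression is the chain rule: the differentiable renderer supplies $\partial I_{\text{render}} / \partial V$, and the image loss supplies $\partial L / \partial I_{\text{render}}$.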

The speed problem and NOCS-based Pix2Pix

The price of physical accuracy is compute. Differentiable path tracing through a refractive medium is far slower than rasterization or learning-based methods. The calibrated high-fidelity simulation output is therefore used as teacher data (a proxy for real data) to train a fast Pix2Pix image-translation model.

The key design choice is the input representation. Instead of the usual depth map, the model takes a NOCS (Normalized Object Coordinate Space) map as input.

  • Limits of depth maps: they carry only the contact geometry (normal-direction information) and no information about the marker displacement caused by shear or rotation. They cannot express a 6D pose (translation + rotation).
  • Strengths of NOCS: it encodes per-pixel object geometry and pose. Because it is rendered directly from the deformed FEM mesh, it implicitly captures translational and rotational deformation.
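The following sketch shows what a NOCS encoding looks like numerically. It follows the common NOCS convention (tight bounding box scaled so its diagonal is 1, centred at 0.5), which is an assumption here: the paper may normalize differently, and `nocs_colors` and the four-vertex patch are hypothetical.

```python
import math

def nocs_colors(rest_vertices):
    """Map rest-pose vertices into the unit cube; each result doubles as an RGB colour."""
    mins = [min(v[k] for v in rest_vertices) for k in range(3)]
    maxs = [max(v[k] for v in rest_vertices) for k in range(3)]
    diagonal = math.dist(mins, maxs)
    if diagonal == 0.0:
        raise ValueError("degenerate shape: all vertices coincide")
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    return [tuple((v[k] - center[k]) / diagonal + 0.5 for k in range(3))
            for v in rest_vertices]

# A flat 4 mm x 3 mm patch of gel surface in its rest pose.
rest = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 3.0, 0.0), (0.0, 3.0, 0.0)]
colors = nocs_colors(rest)
```

Each colour is tied to a material point, not to a pixel. If the patch slides sideways under shear, the colours move with it and the NOCS image changes, while the depth of a flat patch stays the same. That is the information a depth map cannot carry.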

Why Pix2Pix: diffusion models have also been tried, but their inference is too slow. Unlike the CycleGAN setting (unpaired data), the high-fidelity simulation here serves as a direct stand-in for real data and provides perfectly pixel-aligned (paired) samples, so a simple Pix2Pix is enough. The result is inference at hundreds of frames per second.

flowchart LR
    A[FEM deformation<br/>TacFlex] --> B[NOCS map]
    A --> C[Depth map]
    B --> D[Pix2Pix]
    C --> D
    D --> E[Photorealistic<br/>RGB tactile image]
    F[High-fidelity<br/>differentiable render] -.supervision.-> D
    style B fill:#c8e6c9
    style C fill:#ffcdd2

Experiments

Dataset generation

The FEM simulator TacFlex is extended to render RGB, depth, mesh, and NOCS together. The same 10 indenters (pressing tools) used in the real experiments are simulated, with 50 simulation trials per indenter from random initial positions and rotations. Each sequence consists of ① vertical compression, ② translation in four directions, and ③ rotation. The final state of each step is used as the initial condition of the next, and only this forward path is kept, which prevents FEM error from accumulating and speeds up generation. The final dataset has more than 56,000 samples, each with RGB, depth, surface mesh, and NOCS.

Real-world validation setup

  • Sensor: GelSight Mini.
  • Robot: mounted on a calibrated Franka Panda arm.
  • Protocol: (a) indent the sensor surface, (b) translate 2 mm in four directions, (c) rotate ±5° to capture the shear response. 10 indenters are used to cover a range of contact geometries.

What sets this apart is that it includes rotational deformation in addition to the vertical indentation (2 mm depth) and translation that existing benchmarks mostly cover. Evaluation combines image-based metrics (MSE, PSNR, SSIM, SMAPE) with geometry-aware measures (marker position accuracy, deformation-field consistency).

Quantitative results: NOCS vs Depth input

Comparison of input representations for the Pix2Pix model (1,000 validation images, paper Table I):

| Input (ID) | SSIM ↑ | PSNR (dB) ↑ | MSE ↓ |
| --- | --- | --- | --- |
| Depth | 0.9543 | 38.45 | 9.36 |
| NOCS | 0.9609 | 38.74 | 8.69 |

Interpretation: NOCS input beats depth input on all three metrics. The gap is not large (SSIM about +0.007, MSE about 7% lower), but the direction is consistent. More important is the qualitative advantage: NOCS carries the shear and rotational deformation that depth cannot express. Even where the numbers are close, NOCS preserves information that depth cannot produce at all.
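The deltas quoted above can be checked directly from the Table I values. The snippet only redoes the arithmetic on the numbers in the table; the variable names are arbitrary.

```python
depth = {"ssim": 0.9543, "psnr_db": 38.45, "mse": 9.36}
nocs = {"ssim": 0.9609, "psnr_db": 38.74, "mse": 8.69}

ssim_gain = nocs["ssim"] - depth["ssim"]           # absolute SSIM difference
psnr_gain_db = nocs["psnr_db"] - depth["psnr_db"]  # PSNR difference in dB
# Relative MSE reduction of NOCS with respect to depth, in percent.
mse_reduction_pct = 100.0 * (depth["mse"] - nocs["mse"]) / depth["mse"]

print(f"SSIM +{ssim_gain:.4f}, PSNR +{psnr_gain_db:.2f} dB, "
      f"MSE -{mse_reduction_pct:.1f}%")
```

This gives an SSIM gain of 0.0066, a PSNR gain of 0.29 dB, and an MSE reduction of about 7.2%, in line with the rounded figures in the interpretation.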

Qualitative results

  • Fig. 2 (pose + glass): without glass modeling the marker alignment is visibly off. With glass refraction included, position and size line up accurately. Color coding (white = match, cyan = missing, magenta = extra) visualizes the alignment quality.
  • Fig. 3 / Fig. 6 (Pix2Pix): NOCS→RGB and Depth→RGB translation outputs are compared as input / generated / ground-truth triplets. The NOCS path preserves deformation information better.
  • Three checks for the glass effect: ① marker alignment precision, ② consistency of marker size between simulation and reality, ③ measurement of the marker pattern on the textured mesh. The authors report "perfect matching" with the real sensor and consistent marker size.

What it means

In one line, the experiments say this: if glass refraction is built into the physical model and pose and lighting are optimized automatically, marker-level alignment and multi-axis deformation (including rotation and shear) can be reproduced without manual calibration. NOCS + Pix2Pix then "compresses" that fidelity to real-time speed so it can be used for downstream learning.

Critical discussion

Strengths

  • Automation with minimal data: camera pose, lighting, and texture are calibrated automatically from just 3 real images. Tackling the unit-to-unit pose variation caused by manufacturing tolerance head-on is very useful in practice. Breaking the habit of assuming the pose is known is the core of it.
  • Physically based consistency: glass refraction is integrated into the light transport model rather than patched with a post-hoc Snell correction. Size distortion is handled naturally along with position.
  • Added value of differentiability: the same renderer can solve the inverse problem (single image → mesh reconstruction), which extends the work beyond a data generator.
  • A sensible split between speed and fidelity: slow path tracing is used only for offline high-quality data generation, and real-time inference is delegated to Pix2Pix. This two-tier strategy is practical.
  • The NOCS insight: bringing in NOCS, which encodes the shear and rotation that depth cannot, as a tactile input is a smart move and likely to be reusable in other tactile learning work.

Weaknesses and limitations

  • Small quantitative margin: the NOCS vs Depth table points the right way, but the absolute gap is small (SSIM 0.9609 vs 0.9543). The paper also claims improvement on all metrics over existing methods, yet it gives no table of direct numerical comparisons against external baselines, so the state-of-the-art claim is hard to verify. (Speculation) The tables may have been shortened because of the workshop / short-paper format.
  • Compute cost: differentiable path tracing through a refractive medium is heavy. The authors acknowledge this as a limitation and work around it with Pix2Pix, but that only addresses offline data generation. The cost of the calibration stage itself remains.
  • Marker dependence: the alignment losses (Eqs. 2, 3, 7) rely heavily on marker regions. As the authors admit in their limitations section, the method may be a poor fit for tasks such as edge detection or dense contact-shape recognition, where sparse markers carry too little information.
  • Limited generalization evidence: validation covers a single sensor, the GelSight Mini. Generalization to other vision-based tactile sensors (DIGIT, GelSlim, and so on) is left for future work.
  • Reproducibility details: concrete hyperparameters such as the optimizer, learning rate, iteration count, and per-stage convergence criteria are not sufficiently spelled out in the paper (only the weights $\alpha$ and $\beta$ are stated).

Comparison with related work

| Approach | Representative work | Rendering | Refraction handling | Pose assumption | Differentiable |
| --- | --- | --- | --- | --- | --- |
| Rigid body + rasterization | TACTO, Taxim, Tactile Gym | rasterization + manual LED | ignored / simplified | known | no |
| Ray tracing | Agarwal et al. (ICRA'21) | ray tracing for shadows | partial | known | no |
| FEM + Snell correction | TacFlex | marker texture + background | post-hoc Snell correction | known | no |
| Differentiable MPM | DiffTactile | differentiable physics | not applicable | - | marker positions only |
| This paper | Duret et al. (ICRA'26) | differentiable path tracing (Mitsuba 3) | built into light transport | optimized | full RGB image |

The gist: earlier differentiable tactile work such as DiffTactile differentiated only as far as marker displacement, whereas this paper is the first to make the entire generation of the final RGB image differentiable. It also replaces TacFlex's manual glass correction with an integrated physical model and promotes the camera pose, previously assumed known, to an optimization target.

Summary and conclusion

This paper attacks the sim-to-real gap of vision-based tactile sensors head-on with differentiable path tracing (Mitsuba 3). The core is that camera pose, lighting, and texture are calibrated automatically by gradient descent from just 3 real images, and that glass refraction, long the biggest headache, is handled inside the physical light transport model instead of by post-hoc correction.

The pipeline is a staged optimization: (1) pose without glass → (2) pose refinement with glass → (3) joint lighting and material → (4) marker texture. The same differentiable renderer also solves the inverse problem of reconstructing a mesh from a single image. The cost of heavy path tracing is sidestepped with NOCS-map-based Pix2Pix for real-time inference, and NOCS, which captures shear and rotational deformation better than depth, came out ahead on every metric (SSIM 0.9609, PSNR 38.74 dB, MSE 8.69). Validation used a GelSight Mini, a Franka Panda, 10 indenters, and more than 56,000 samples.

The message for robotics practitioners is clear. Instead of spending time manually calibrating a pose that differs for every sensor unit, you can ask a differentiable renderer to calibrate itself against real images. That is a solid foundation for both tactile data generation and tactile shape perception.

The remaining work is generalization to other tactile sensors, stronger quantitative comparison against external baselines, and extension to dense RGB gradients that do not depend on markers (a direction the authors propose through integration with DiffTactile). That extension will be decisive for tasks where sparse markers are not enough, such as edge detection, shape reconstruction, and pose estimation.

One-line summary: "If even glass refraction is made differentiable, a tactile simulator can look at real images and calibrate itself."

Note: how the sources for this review were obtained

Direct parsing of the original PDF was blocked by bot protection (restricted access), so the bibliographic information (authors: Guillaume Duret, Anna Samsonenko, Florence Zara, Jan Peters, Liming Chen / ICRA 2026 workshop) was confirmed through OpenAlex, and the review was then written from the full text of the HAL version (hal-05488623). All tables, equations, and experimental numbers are based on the paper's text, and anything not stated there is marked "(Speculation)".

Copyright 2026, JungYeon Lee