

📃VCGS Review

grasp
pointcloud
vae
paper
Variational Constrained Grasp Sampler
Published

March 17, 2024

This post summarizes my reading of the Variational Constrained Grasp Sampler (VCGS) paper. The paper was accepted at IROS 2023 and introduces VCGS, a new generative grasp sampling network for sampling 6-degrees-of-freedom (DoF) grasps that are constrained to a specific target area. It also presents CONG, a new dataset containing more than 14 million training samples. In both simulation and real-world tests, the proposed VCGS achieves a 10-15% higher grasp success rate than the baseline GraspNet while being 2-3 times more sample efficient.

1 Introduction

Grasping objects with a robot arm (manipulator) is a fundamental task, with uses ranging from manufacturing processes to assistive roles in the home. Accordingly, a lot of research has looked for ways to do grasping (or gripping) well across the whole pipeline, from the perception stage to the control and motion stage. Against this background, specifying a precise grasp location has become important for tasks that require interacting with a specific part of an object. To address this, the paper proposes a new method for sampling 6-DoF grasps that are constrained to a specific target area.

6-DoF grasping requires control over more degrees of freedom than the common top-down 4-DoF grasping, which makes it a more challenging task. The goal is to perceive a 3D object in the physical world from point cloud data, produce a grasp pose that holds the object stably, and finally move the manipulator to that pose.

VCGS Process

The paper contributes not only VCGS, a model that proposes stable grasp poses, but also CONG, a constrained-grasp dataset on which such a model can be trained well. As Table 1 shows, CONG is larger in absolute size than the other datasets, and it is the only one that also includes task-agnostic constraint information, which is what makes it notable.

CONG Dataset Comparison

2 Problem Statement

To generate grasp poses, the problem is defined with the terms below. The gripper used in the paper is a simple two-finger gripper that performs a pinching motion, so once a parallel-jaw grasp pose is chosen, the grasping motion can be controlled.

  • G: parallel-jaw grasp poses
    • 7 dimensions: quaternion[4] + 3d-position[3]
    • Valid when the grasp lies on target area A and is stable (S=1)
  • O: Object Point-cloud (dimensions: Nx3)
  • A: Target Area Point-cloud (dimensions: Mx3)
Point cloud data and Joint Probability Separation

The objective VCGS has to learn is to maximize P(G, S | O, A). Unpacked, this means: given the point cloud of the object and the point cloud of the target area that should be grasped, increase the probability of generating a successful (S) grasp G. The paper approximates this with two networks, the Grasp Sampler and the Grasp Evaluator, and trains each of them.
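By the chain rule of probability, the joint distribution can be split into two factors. Reading the second factor as what the Grasp Sampler approximates and the first as what the Grasp Evaluator approximates is my interpretation of the "joint probability separation" above, not a formula quoted from the paper:

$$
P(G, S \mid O, A) = P(S \mid G, O, A)\, P(G \mid O, A)
$$

Under this reading, the sampler proposes grasps G on the target area, and the evaluator scores how likely each proposed grasp is to be stable.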

Grasp Pose explanation
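To make the 7-dimensional grasp representation concrete, here is a minimal sketch that converts one grasp vector into a 4x4 homogeneous transform. The function name `grasp_to_matrix` and the element order (quaternion as w, x, y, z, followed by position x, y, z) are my assumptions; the ordering used in the paper's code may differ.

```python
import numpy as np

def grasp_to_matrix(grasp):
    """Convert a 7-D grasp [qw, qx, qy, qz, x, y, z] into a 4x4 homogeneous transform.

    The (w, x, y, z) quaternion order is an assumption made for this sketch.
    """
    grasp = np.asarray(grasp, dtype=float)
    q, t = grasp[:4], grasp[4:]
    q = q / np.linalg.norm(q)  # normalize in case a network output is not exactly unit length
    w, x, y, z = q
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])
    T = np.eye(4)
    T[:3, :3] = R  # gripper orientation
    T[:3, 3] = t   # gripper position
    return T
```

For example, `grasp_to_matrix([1, 0, 0, 0, 0.1, 0.2, 0.3])` gives an identity rotation with the translation (0.1, 0.2, 0.3).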

3 Method

This section covers the VCGS model, which, as described above, consists of a Grasp Sampler and a Grasp Evaluator, and the CONG dataset.

3.1 Grasp Sampler

The Grasp Sampler in VCGS is built on the Conditional Variational Autoencoder (CVAE) structure, as shown below. To sample a diverse set of grasp poses, the encoder and decoder follow the VAE structure and use a Gaussian prior distribution, so that the model can generate as many different feasible grasp poses as possible.

C-VAE architecture
Loss Function
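The exact loss is only shown in the figure, so the following is a sketch of the generic CVAE objective rather than the paper's formula: a reconstruction term plus a KL term that pulls the encoder's Gaussian toward the standard normal prior. The squared-error reconstruction on the 7-D grasp vector, the `beta` weight, and all function names are assumptions for illustration.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dims."""
    return 0.5 * np.sum(mu ** 2 + np.exp(log_var) - 1.0 - log_var, axis=-1)

def cvae_loss(grasp_pred, grasp_true, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL, averaged over the batch.

    The squared error on raw grasp vectors is a placeholder reconstruction term.
    """
    recon = np.sum((grasp_pred - grasp_true) ** 2, axis=-1)
    return float(np.mean(recon + beta * kl_to_standard_normal(mu, log_var)))
```

At inference time the encoder is not needed: a latent z is drawn from the Gaussian prior and decoded together with the condition (O and A), which is what lets one input produce many different grasp poses.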

3.2 Grasp Evaluator

During training, the encoder only ever learns from good grasp data. An Evaluator Network is therefore added so that the model is exposed to a wider variety of grasp data, including bad grasps.

Input data format and Evaluator Network

3.3 CONG Dataset

CONG Dataset construction process

Components

  • O: object point cloud

  • G*: successful grasps randomly sampled on target area A

Dataset construction process

  1. Place the object at the origin with a random orientation and render the point cloud O [N x 3]
  2. Sample query points I [K x 3] from O (K << N) using Farthest Point Sampling
  3. For each query point x_i (∈ I), find all neighboring points A_i within a radius r_i (~ U[0, R])
    • Here R is the diagonal length of the mesh bounding box
  4. Find every grasp G whose grasp center point is within a distance of at most d from any point in A_i
Extracting grasp data from mesh data
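Steps 2 to 4 above can be sketched with NumPy as follows. This is a rough illustration, not the authors' pipeline: the function names are hypothetical, the bounding-box diagonal is taken from the point cloud rather than from the mesh, and grasps are reduced to their center points.

```python
import numpy as np

def farthest_point_sampling(points, k, rng):
    """Greedily pick k indices so that each new point is farthest from those already chosen."""
    n = points.shape[0]
    chosen = np.empty(k, dtype=int)
    chosen[0] = rng.integers(n)  # random starting point
    min_dist = np.full(n, np.inf)
    for i in range(1, k):
        d = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        min_dist = np.minimum(min_dist, d)  # distance to the nearest chosen point
        chosen[i] = int(np.argmax(min_dist))
    return chosen

def sample_target_areas(points, k, rng):
    """Steps 2-3: FPS query points, then a random-radius neighborhood A_i around each."""
    # Assumption: axis-aligned bounding box of the point cloud stands in for the mesh's.
    R = np.linalg.norm(points.max(axis=0) - points.min(axis=0))
    areas = []
    for idx in farthest_point_sampling(points, k, rng):
        r = rng.uniform(0.0, R)  # r_i ~ U[0, R]
        mask = np.linalg.norm(points - points[idx], axis=1) <= r
        areas.append(points[mask])
    return areas

def grasps_near_area(grasp_centers, area, d):
    """Step 4: indices of grasps whose center lies within d of at least one point in the area."""
    diff = grasp_centers[:, None, :] - area[None, :, :]
    nearest = np.linalg.norm(diff, axis=2).min(axis=1)
    return np.nonzero(nearest <= d)[0]
```

Because r_i is drawn uniformly up to the full bounding-box diagonal, the sampled target areas range from a tiny patch around the query point to nearly the whole object.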

4 Experiment

The experiments focus on the following two questions.

  1. How well does the method perform on constrained grasping, as measured by grasp success rate?
  2. How much more sample efficient is a constrained grasp sampler than an unconstrained sampler in target-driven grasping?
Checking grasps in IsaacGym

The experimental setup is as follows.

  • Baseline: GraspNet
  • Evaluated both in simulation and on a real robot
  • Evaluation Metric
    • successful/total
    • Criterion for success: after lifting the object and running a predefined motion (linear acceleration + angular acceleration), the object is still held stably in the gripper
Experimental results

4.1 A. Simulated Robotic Grasping

  • Executes the best grasp, NOT the best reachable grasp
  • Both the gripper and the object are free-floating
  • Uses the IsaacGym simulator
    • 123 random objects from the Acronym dataset
    • A depth sensor provides the observation data of the object
  • Two experiments are run in the simulator
    • Unconstrained sampling: grasps are sampled without a target area, i.e. A=O
    • Constrained sampling: grasps are generated only on the target area
  • Baselines
    • GraspNet: SOTA
    • GraspNetTaI: Target as Input. A model in which only the target area is fed to the grasp sampling network
  • VCGS shows a Ratio of Grasps Kept (RGK, %) more than 3 times that of GraspNet
    • This is evidence for the benefit of giving the constraint to the network as an input
  • GraspNetTaI has a lower Success Rate than GraspNet
    • This suggests that using information about the whole object (global) works better than using only information about the specific target area (local)
  • GraspNet's Success Rate depends on the number of grasps sampled (#GS)
    • In the unconstrained case, more sampling can be considered necessary
    • This hypothesis is also supported by the result that RGK is not affected by #GS
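The metrics in this section are simple ratios, and a small sketch shows how they relate to sample efficiency. The definition of RGK as kept grasps divided by sampled grasps is my reading of the name; the filtering criterion that decides which grasps are "kept" is not spelled out here, so the helpers below take the counts as given. All function names are hypothetical.

```python
import math

def success_rate(outcomes):
    """successful / total over a list of per-grasp booleans."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one grasp attempt")
    return sum(1 for ok in outcomes if ok) / len(outcomes)

def ratio_of_grasps_kept(num_kept, num_sampled):
    """RGK in percent: grasps that survive filtering relative to grasps sampled (#GS)."""
    return 100.0 * num_kept / num_sampled

def samples_needed(target_kept, rgk_percent):
    """Number of grasps to sample so that about `target_kept` are expected to survive."""
    return math.ceil(target_kept * 100 / rgk_percent)
```

With made-up numbers, an RGK of 10% versus 30% means sampling 500 versus 167 grasps to end up with 50 usable ones, which is the kind of gap a 3x difference in RGK implies.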

Finally, I will close this post with the presentation video for the paper.

5 Reference

  • Original Paper: VCGS
  • Presentation Video
  • Baseline Model: GraspNet paper

Copyright 2024, Jung Yeon Lee