Obstacle avoidance using the Kinect

Van-Duc Nguyen
Dept. of Electronics & Telecommunications Engineering, Ho Chi Minh City University of Technology, Vietnam
[email protected]
Abstract — In this paper, we introduce a new method for a mobile robot to detect obstacles using the Kinect. The method uses the Kinect with the robust support of the Point Cloud Library for 3D image processing. The robot obtains precise information about its environment for motion planning.

Keywords — mobile robot; kinect; obstacle avoidance; point cloud
I. INTRODUCTION

Given a priori knowledge of the environment and the goal position, mobile robot navigation refers to the robot's ability to move safely towards the goal using this knowledge and sensory information about the surrounding environment. In practice, for a mobile robot operating in an unstructured environment, knowledge of the environment is usually absent or partial. Therefore, obstacle detection and avoidance are essential parts of mobile robot missions. Obstacle avoidance refers to methodologies for shaping the robot's path so as to overcome unexpected obstacles. The resulting motion depends on the robot's actual location and on the sensor readings. Although there are many different ways to avoid obstacles, most of them use ultrasonic sensors; camera sensors, however, have recently attracted growing interest. The Kinect is not only a normal camera sensor but also a special device that can provide a depth map. The depth map is acquired through the OpenNI library and then processed by the Point Cloud Library to extract accurate information about the environment.
II. KINECT AND DEPTH MAP

The main purpose of using the Kinect in this project is to reconstruct the 3D scene from depth data, which is the information the robot needs for motion planning. The following sections give more details on the Kinect's components and the way it works.

A. The main components of Kinect

Fig. 1: The main components of Kinect

The Kinect includes an RGB camera, 3D depth sensors, a multi-array microphone and a motorized tilt.
RGB camera: the RGB video stream uses 8-bit VGA resolution (640 × 480 pixels) with a Bayer color filter at a frame rate of 30 Hz [1].
3D depth sensors: depth data is acquired by the combination of the IR projector and the IR camera [2].
Multi-array mic: the microphone array features four microphone capsules, with each channel processing 16-bit audio at a sampling rate of 16 kHz.
Motorized tilt: the motorized pivot is capable of tilting the sensor up to 27° either up or down.

One of the Kinect's most distinctive features is that it provides raw depth. The next section presents the way the Kinect calculates the distance to objects in the space in front of it.

B. Depth calculation

The Kinect has two cameras and one laser-based IR projector. Figure 2 shows their placement on the device. Each lens is associated with a camera or a projector [3].
Fig. 2: Inside the Kinect: RGB camera, IR camera and IR projector
The IR camera and the IR projector form a stereo pair with a baseline of approximately 7.5 cm. The IR projector sends out a fixed pattern of light and dark areas, shown in figure 3. The pattern is generated from a set of diffraction gratings, with special care taken to eliminate the effect of zero-order propagation of the center bright dot. Depth is then calculated by triangulating the pattern received through the IR camera against the known pattern emitted by the IR projector. This technology, called Light Coding, was developed by PrimeSense and is suitable for any indoor environment. Many current range-finding technologies use time of flight, determining the distance to an object by measuring the time it takes a flash of light to travel to its surface and reflect back. Light Coding uses an entirely different approach in which the light source is constantly turned on, greatly reducing the need for precision timing of the measurements [4].
Fig. 3: Fixed pattern emitted by the IR projector [6]
From knowledge of the emitted light pattern, the lens distortion, and the distance between emitter and receiver, the distance to each dot can be estimated by calculation. This is done internally in the Kinect by the PS1080 SoC [5], and the final depth image is directly available. Figure 4 shows the basic process of producing the scene depth image.
Fig. 4: The process of acquiring depth data
To illustrate how a surface is measured using a single dot projected by the projector, see figure 5 [7].

Fig. 5: A Kinect-like setup with a single-point projector
Figure 5 shows a Kinect-like setup in which a projector projects just a single point into space along the green line. The point is captured by the IR camera once it hits a surface. There are four planes in the illustration: a reference plane, a plane closer to the camera than the reference plane, a more distant plane, and the projected image plane of the IR camera. When the point hits a surface in the close plane, it appears further to the right in the image plane than if it had hit a surface in the reference plane. Likewise, when the point is projected onto an object in a plane more distant than the reference plane, the point appears further to the left. When the origin and direction of the light are known in advance, and the horizontal position of the dot is known for a reference depth, it is possible to find the depth of the surface the point hits based on its horizontal position in the camera's image plane.

Now, the Kinect does not project just a single point, but a large number of points in an intricate pattern. Internally, it holds a reference image of what the pattern looks like from the IR camera's viewpoint when all points fall on a surface at a certain, known distance. It finds correspondences between the pattern captured by its camera and this internal reference pattern. By comparing the horizontal position of a point in the captured image to its corresponding horizontal position in the reference image, a horizontal delta value can be extracted, which in turn can be used to calculate the depth of the pixel, just as described above for the single-point projector. The Kinect itself does not actually calculate the depth, but returns a more abstract value for the host system to handle. While OpenNI abstracts this away from the developer, libfreenect, another driver and library platform, makes these raw 11-bit values available.

Because there are fewer points in the IR pattern than there are pixels in the depth map, some parts of the depth map are interpolated, meaning that one cannot expect the Kinect to be pixel-precise. It works best for smooth, continuous surfaces. According to Nicolas Burrus, who pioneered publishing information about the Kinect from his own experiments, the depth z of a point can be calculated in meters from the raw disparity d of the point (as provided by the Kinect hardware) using the following equation [8]:
z = 1 / (-0.0030711016 · d + 3.3309495161)
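For illustration only, this conversion can be written as a small C++ helper. The constants are those of the equation above; the cut-off values correspond to the approximate reliable range discussed next, and the function name is ours:

    #include <cstdint>

    // Convert a raw 11-bit Kinect disparity value d (0..2047) to depth in
    // meters, using the constants reported by Nicolas Burrus [8].
    // Returns a negative value when d lies outside the reliable range
    // (about 434..1030, i.e. roughly 0.5 m to 5.0 m in our tests).
    double rawDisparityToMeters(std::uint16_t d)
    {
        if (d < 434 || d > 1030)
            return -1.0;  // no reliable depth for this disparity
        return 1.0 / (-0.0030711016 * static_cast<double>(d) + 3.3309495161);
    }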
d is an 11-bit integer ranging from 0 to 2047. We carried out tests with the Kinect pointing straight at a wall using OpenNI, and found that the reliable depth values range from 0.5 to 5.0 meters. This means that only d values of about 434 to 1030 represent actual measurable depths. One consequence is that for any object closer than about 0.5 meters to the device, no depth data is returned.

III. OBSTACLE DETECTION

Once the depth map has been acquired through OpenNI, it is processed by the Point Cloud Library to extract each obstacle together with its full information. Figure 6 shows the flow of the process:

Fig. 6: Flow of image data processing
A. Depth Map

The depth map is provided by OpenNI at different resolutions and frame rates [9].

Fig. 7: RGB image and depth map

TABLE I
RESOLUTIONS AND FRAME RATES SUPPORTED BY THE OPENNI LIBRARY

Mode         Resolution (pixel × pixel)   Frame rate (fps)
SXGA_15Hz    1280 × 1024                  15
VGA_30Hz     640 × 480                    30
QVGA_30Hz    320 × 240                    30
B. Point Cloud

A point cloud P is a set of points p_i whose values represent n-dimensional information about the world (usually n = 3) [10]:

P = {p_1, p_2, ..., p_i, ..., p_n},  p_i = (x_i, y_i, z_i)

Fig. 8: RGB point cloud
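Such an RGB point cloud can be obtained directly from the device. As a minimal sketch, closely following PCL's OpenNI grabber tutorial (the class and function names here are ours, not the paper's code):

    #include <pcl/io/openni_grabber.h>
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>
    #include <boost/bind.hpp>
    #include <boost/function.hpp>

    // Receive RGB point clouds from the Kinect via PCL's OpenNI grabber.
    class KinectSource
    {
    public:
      void run ()
      {
        pcl::Grabber* grabber = new pcl::OpenNIGrabber ();
        boost::function<void (const pcl::PointCloud<pcl::PointXYZRGB>::ConstPtr&)> f =
          boost::bind (&KinectSource::cloudCallback, this, _1);
        grabber->registerCallback (f);
        grabber->start ();   // streaming begins; one callback fires per frame
        // ... run until processing is finished ...
        grabber->stop ();
        delete grabber;
      }

    private:
      void cloudCallback (const pcl::PointCloud<pcl::PointXYZRGB>::ConstPtr& cloud)
      {
        // One RGB point cloud (as in Fig. 8) arrives per frame;
        // hand it to the obstacle-detection pipeline here.
      }
    };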
Besides XYZ data, each point p can hold additional information such as RGB color, intensity values, distances, segmentation results, etc. To increase processing speed, the point clouds are downsampled and filtered by the Voxel Grid and Pass Through filters described next.
C. Voxel Grid and Pass Through [11]

Voxel grid generation is a technique that divides 3D space into small cubes (or 3D rectangular boxes, if different dimensions are used along the x, y and z axes). The main idea is to count the points inside each cube of a given size; if the count is greater than a certain number, all of those points are replaced by a single point, a huge reduction in the size of the data. The single point is the average of the contained points along each axis.

Fig. 9: Point cloud after Voxel Grid
The Pass Through filter limits point clouds in the X, Y and Z dimensions. In our project, we pass through point clouds in the Z dimension in a range from 0.5 to 1.4 meters.

Fig. 10: Pass Through
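For illustration, both filters can be applied with a few PCL calls. The 1 cm leaf size below is an assumed value; the 0.5 to 1.4 m Z range is the one used in our project:

    #include <pcl/filters/voxel_grid.h>
    #include <pcl/filters/passthrough.h>
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>

    typedef pcl::PointCloud<pcl::PointXYZRGB> Cloud;

    // Downsample with a Voxel Grid, then keep only points whose depth (Z)
    // lies in the working range.
    Cloud::Ptr downsampleAndCrop (const Cloud::ConstPtr& input)
    {
      Cloud::Ptr voxelized (new Cloud), cropped (new Cloud);

      pcl::VoxelGrid<pcl::PointXYZRGB> voxel;
      voxel.setInputCloud (input);
      voxel.setLeafSize (0.01f, 0.01f, 0.01f);  // 1 cm cubes (assumed value)
      voxel.filter (*voxelized);

      pcl::PassThrough<pcl::PointXYZRGB> pass;
      pass.setInputCloud (voxelized);
      pass.setFilterFieldName ("z");
      pass.setFilterLimits (0.5f, 1.4f);        // range used in our project
      pass.filter (*cropped);

      return cropped;
    }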
D. Planar segmentation
Plane detection is a prerequisite for a wide variety of vision tasks. The RANdom SAmple Consensus (RANSAC) algorithm (Fischler and Bolles [12]) is widely used for plane detection in point cloud data.
Fig. 11: Flow chart of the RANSAC algorithm [13]

Figure 11 shows the flow chart of the RANSAC algorithm for finding a plane model from a set of input points. The inputs are the number of iterations and a distance threshold that decides which points belong to the model. The algorithm randomly chooses three points and models the plane through them as ax + by + cz + d = 0. Next, RANSAC computes the distance from every other point to this plane; each point whose distance is smaller than the threshold is counted as an inlier of the plane model. The algorithm ends when it completes the given number of iterations and returns the plane model with the largest number of inliers. The floor is easily detected by the RANSAC algorithm; it appears as the blue point cloud in figure 12.
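A minimal PCL sketch of this step might look as follows; the iteration count and inlier threshold are assumed values, not the parameters used in the paper:

    #include <pcl/ModelCoefficients.h>
    #include <pcl/PointIndices.h>
    #include <pcl/point_types.h>
    #include <pcl/sample_consensus/method_types.h>
    #include <pcl/sample_consensus/model_types.h>
    #include <pcl/segmentation/sac_segmentation.h>

    // Detect the dominant plane (the floor) with RANSAC.
    pcl::PointIndices::Ptr detectFloor (
        const pcl::PointCloud<pcl::PointXYZRGB>::ConstPtr& cloud,
        pcl::ModelCoefficients& plane)      // receives (a, b, c, d)
    {
      pcl::PointIndices::Ptr inliers (new pcl::PointIndices);
      pcl::SACSegmentation<pcl::PointXYZRGB> seg;
      seg.setModelType (pcl::SACMODEL_PLANE);   // fit ax + by + cz + d = 0
      seg.setMethodType (pcl::SAC_RANSAC);
      seg.setMaxIterations (100);               // assumed iteration budget
      seg.setDistanceThreshold (0.02);          // assumed 2 cm inlier threshold
      seg.setInputCloud (cloud);
      seg.segment (*inliers, plane);  // inliers = floor points (blue in Fig. 12)
      return inliers;
    }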
E. Euclidean cluster extraction

Fig. 12: Point clouds after filtering and segmentation

Euclidean cluster extraction separates the remaining clouds above the floor into clusters, each of which represents one obstacle. Figure 12 shows two clusters: the red cluster is the obstacle closest to the robot, and the farther one is green.
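As an illustrative sketch of this step with PCL (the tolerance and size limits are assumed values):

    #include <pcl/PointIndices.h>
    #include <pcl/point_types.h>
    #include <pcl/search/kdtree.h>
    #include <pcl/segmentation/extract_clusters.h>
    #include <vector>

    // Split the remaining (non-floor) cloud into one cluster per obstacle.
    std::vector<pcl::PointIndices>
    extractObstacles (const pcl::PointCloud<pcl::PointXYZRGB>::ConstPtr& cloud)
    {
      pcl::search::KdTree<pcl::PointXYZRGB>::Ptr tree
        (new pcl::search::KdTree<pcl::PointXYZRGB>);
      tree->setInputCloud (cloud);

      std::vector<pcl::PointIndices> clusters;
      pcl::EuclideanClusterExtraction<pcl::PointXYZRGB> ec;
      ec.setClusterTolerance (0.05);  // points within 5 cm join a cluster (assumed)
      ec.setMinClusterSize (100);     // drop tiny spurious clusters (assumed)
      ec.setSearchMethod (tree);
      ec.setInputCloud (cloud);
      ec.extract (clusters);          // e.g. the red and green clusters in Fig. 12
      return clusters;
    }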
F. Object clusters

In this step, each cluster is analyzed to provide the corresponding obstacle's size. This work is supported by effective functions in PCL.
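One simple way to obtain these dimensions, shown here as an illustration rather than the paper's exact method, is the axis-aligned bounding box of each cluster:

    #include <pcl/common/common.h>   // pcl::getMinMax3D
    #include <pcl/point_types.h>

    // Estimate an obstacle's size from its cluster via the
    // axis-aligned bounding box.
    void clusterSize (const pcl::PointCloud<pcl::PointXYZRGB>& cluster,
                      float& width, float& height, float& depth)
    {
      pcl::PointXYZRGB minPt, maxPt;
      pcl::getMinMax3D (cluster, minPt, maxPt);
      width  = maxPt.x - minPt.x;
      height = maxPt.y - minPt.y;
      depth  = maxPt.z - minPt.z;
    }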
Fig. 13: Object cluster
Figure 13 shows a cluster after analysis, with its full dimensions. This information is very important for the robot's motion planning.
IV. CONCLUSION

Obstacle avoidance using the Kinect overcomes the limitations of methods based on a single camera, such as dependence on color and texture or the inability to handle obstacles in the air. Moreover, the processing speed is acceptable for real-time applications, which is still a problem when using two cameras.
REFERENCES
[1] http://en.wikipedia.org/wiki/Kinect
[2] http://www.ros.org/wiki/kinect_calibration/technical
[3] Autor Práce, "Head pose estimation and tracking," p. 23, 2011.
[4] Mikkel Viager, "Analysis of Kinect for Mobile Robots," Technical University of Denmark, p. 11.
[5] http://www.primesense.com/en/company-profile
[6] http://www.ros.org/wiki/kinect_calibration/technical
[7] Jacob Kjær, "A Qualitative Analysis of Two Automated Registration Algorithms In a Real World Scenario Using Point Clouds from the Kinect," June 27, 2011.
[8] http://nicolas.burrus.name/index.php/Research/KinectCalibration
[9] http://openni.org/Documentation/
[10] Radu Bogdan Rusu, "Point Cloud (2) processing in ROS," May 2, 2010.
[11] http://pointclouds.org/documentation/tutorials/
[12] http://en.wikipedia.org/wiki/RANSAC
[13] Sunglok Choi, Taemin Kim, Wonpil Yu, "Performance Evaluation of RANSAC Family," 2009.