README.md

Required packages: opencv-contrib-python==3.4.8.29, ffmpeg-python, numpy, matplotlib (os is part of the Python standard library and does not need to be installed)
- Chaerim Moon
(a) Image undistortion (reference - https://docs.opencv.org/4.x/dc/dbb/tutorial_py_calibration.html) Using the distortion coefficients and the intrinsic/extrinsic camera matrices obtained from camera calibration, each image can be rectified. Since the left camera was mounted upside-down, the frames from the left camera were rotated by 180 degrees when the calibrated frames were saved.
(b) Ball detection (reference - https://learnopencv.com/blob-detection-using-opencv-python-c/) For ball detection, a color channel mask and the OpenCV SimpleBlobDetector are used. The color mask keeps the pixels that fall within the color channel thresholds. The SimpleBlobDetector was configured with area and circularity parameters: among blobs whose area falls within a defined range, it searches for a candidate that satisfies the circularity condition. The mask thresholds and the SimpleBlobDetector parameters were tuned manually while checking the results.
(c) Object's 3D position estimation with a single camera For single-camera position estimation, the detected 2D position in the image plane was back-projected using the inverse of the intrinsic matrix. The obtained values correspond to 1/w [X; Y; Z], where w is a scale coefficient. The coefficient w was calculated so that the distance between the ball-center and ball-top positions equals the actual radius of the ball. With the intrinsic/extrinsic camera matrices and this coefficient, the detected ball center was converted to 3D world coordinates - the world coordinate frame was defined with the left camera as the reference.
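The scale recovery can be sketched with numpy alone, assuming both the ball-center and ball-top pixels lie at the same depth; the intrinsic matrix and pixel values in the usage below are synthetic examples, and the camera-to-world extrinsic transform is omitted by taking the reference camera frame as the world frame.

```python
import numpy as np

def ball_center_3d(K, center_px, top_px, ball_radius):
    """Back-project the ball-center and ball-top pixels through K^-1,
    then choose the scale that makes their 3D separation equal the
    known ball radius (both points assumed at the same depth)."""
    K_inv = np.linalg.inv(K)
    ray_c = K_inv @ np.array([center_px[0], center_px[1], 1.0])
    ray_t = K_inv @ np.array([top_px[0], top_px[1], 1.0])
    # Scale coefficient: separation of the scaled rays = actual radius
    s = ball_radius / np.linalg.norm(ray_c - ray_t)
    return s * ray_c
```

For example, with focal length 500 px, principal point (320, 240), a 0.05 m radius ball centered at pixel (345, 190) with its top at (345, 177.5), the recovered position is (0.1, -0.2, 2.0) m in the camera frame.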
- Jongwon Lee
(a) We implement a Kalman filter to estimate smooth ball-position states, using the 3D positions obtained from each camera in the prior steps as measurements. Our goal was a Kalman filter robust to scenarios where measurements from either camera may be unavailable, resulting in 'intermittent zero measurements'. This was achieved by developing a Kalman filter that treats the sets of 3D positions from the two cameras as two separate sets of measurements. We observed that this approach effectively smooths the ball-position estimate, even when it is not directly attainable from single-camera estimates.
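A minimal filter of this flavor can be sketched as follows: a constant-velocity model over [x, y, z, vx, vy, vz], where each camera's 3D position is applied as a separate sequential update and a missing measurement (passed as None) simply skips that update. The noise covariances and time step are illustrative assumptions, not the values used in the project.

```python
import numpy as np

class BallKF:
    """Constant-velocity Kalman filter over state [x, y, z, vx, vy, vz].
    Each camera contributes a 3D position measurement; either may be missing."""

    def __init__(self, dt=1 / 30.0, q=1.0, r=0.01):
        self.x = np.zeros(6)                    # state estimate
        self.P = np.eye(6)                      # state covariance
        self.F = np.eye(6)                      # constant-velocity transition
        self.F[:3, 3:] = dt * np.eye(3)
        self.Q = q * np.eye(6)                  # process noise (illustrative)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # position observed
        self.R = r * np.eye(3)                  # measurement noise (illustrative)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, z):
        if z is None:                           # intermittent measurement: skip
            return
        y = np.asarray(z, dtype=float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

    def step(self, z_cam1, z_cam2):
        """One filter step: predict, then sequentially fuse both cameras."""
        self.predict()
        self.update(z_cam1)
        self.update(z_cam2)
        return self.x[:3].copy()
```

When both cameras drop out on a frame, the step reduces to pure prediction, which is what keeps the estimate smooth through short gaps.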
cam1: left camera
cam2: right camera
- folders
  test2_cam#: the raw image files from the hardware experiment (f00000#.png)
  test2_cam#_calib: the rectified image files (f00000#_calib.png)
  test2_cam#_ball: the image files with a circle on the detected ball (f00000#_ball.png)
  ball_3d: the result csv and video files
The frame numbers across all the image and csv files are consistent: e.g., f01000.png, f01000_calib.png, f01000_ball.png, and the corresponding rows in the csv files all refer to the same frame.
- files in ball_3d
  ball_blob_cam#.csv: the detected blob position and size in the image plane, in the order of blob position (x, y) and blob size
  ball_3d_cam#.csv: estimated 3D ball positions in world coordinates, in the order of x, y, and z (x-axis: left(-)/right(+), y-axis: up(-)/down(+), z-axis: inward(+)/outward(-); unit: m)
  ball_3d_cam#.mp4: the result video made from the test2_cam#_ball image files, with the estimated 3D position written in the top-left corner
ball_3d_fused.csv: the fused estimation result from both cameras
Current issues:
- The ball is detected as smaller than its real size because of the bright lighting on its top part, which results in overestimation of the depth.
- When occlusion starts occurring, the detector fails to locate the center of the ball correctly.
  - Potentially, the blob size could be used to detect when incorrect detection begins.
  - I would suggest assigning a weight to the frames from each camera so that only reliable information is used.
- The 3D position estimation results from the two cameras have offsets - probably due to calibration error?
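One way to act on the blob-size suggestion above is a simple gate that zeroes the weight of frames whose detected blob size deviates too far from a sliding-window median. This is only a sketch of the idea, not code from the project; the window length and relative tolerance are made-up defaults.

```python
import numpy as np

def blob_size_weights(sizes, window=15, rel_tol=0.3):
    """Weight each frame by how consistent its blob size is with the
    median size over a trailing window: 1.0 if within rel_tol of the
    median, 0.0 otherwise (flagging likely occlusion / mis-detection)."""
    sizes = np.asarray(sizes, dtype=float)
    weights = np.ones_like(sizes)
    for i in range(len(sizes)):
        lo = max(0, i - window)
        med = np.median(sizes[lo:i + 1])
        if med > 0 and abs(sizes[i] - med) > rel_tol * med:
            weights[i] = 0.0
    return weights
```

These weights could then scale the per-camera measurement trust in the fusion step, so that frames with an implausibly small blob (e.g., during occlusion) contribute nothing.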