Vision API Reference
This page documents ManipulaPy.vision, the module that provides computer vision capabilities: stereo vision, camera calibration, PyBullet integration, and optional YOLO-based object detection.
Tip
For conceptual explanations, see Vision User Guide.
—
Vision Class
—
Camera Management
Configuration Methods
Extrinsic Matrix Computation
—
Image Acquisition
PyBullet Integration
—
Object Detection
YOLO Integration
—
Stereo Vision Pipeline
Rectification Setup
Image Processing
3D Reconstruction
—
PyBullet Debug Interface
Debug Slider Setup
—
Resource Management
Cleanup Methods
—
Utility Functions
—
Data Structures and Configuration
Internal Storage Format
Camera Storage (self.cameras):
self.cameras[index] = {
    "name": str,
    "translation": [x, y, z],
    "rotation": [roll_deg, pitch_deg, yaw_deg],  # Euler angles in degrees
    "fov": float,
    "near": float,
    "far": float,
    "intrinsic_matrix": np.ndarray,   # shape (3, 3)
    "distortion_coeffs": np.ndarray,  # shape (5,)
    "use_opencv": bool,
    "device_index": int,
    "extrinsic_matrix": np.ndarray,   # shape (4, 4)
}
Stereo Configuration Attributes:
# Stereo processing state
self.stereo_enabled: bool
self.left_cam_cfg: dict
self.right_cam_cfg: dict
# Rectification maps
self.left_map_x: np.ndarray
self.left_map_y: np.ndarray
self.right_map_x: np.ndarray
self.right_map_y: np.ndarray
# 3D reconstruction
self.Q: np.ndarray # 4x4 disparity-to-depth matrix
self.stereo_matcher: cv2.StereoSGBM
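To illustrate how the 4x4 matrix Q is used, the sketch below reprojects a disparity map to 3D points with plain NumPy. It applies the homogeneous mapping [X, Y, Z, W]^T = Q [u, v, d, 1]^T followed by division by W, which is the operation cv2.reprojectImageTo3D performs. The function name and the example values (focal length 500 px, baseline 0.1 m, a simplified Q with a positive 1/baseline entry) are assumptions for illustration and are not taken from ManipulaPy.

```python
import numpy as np

def reproject_to_3d(disparity, Q):
    """Map an (H, W) disparity image to (H, W, 3) points using a 4x4 Q matrix."""
    disparity = np.asarray(disparity, dtype=np.float64)
    h, w = disparity.shape
    # u holds the column index of each pixel, v the row index
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    # Homogeneous pixel vectors [u, v, d, 1], one per pixel: shape (H, W, 4)
    pix = np.stack([u, v, disparity, np.ones_like(disparity)], axis=-1)
    homog = pix @ Q.T
    # Zero disparity gives W = 0, so the division may yield inf or nan
    with np.errstate(divide="ignore", invalid="ignore"):
        return homog[..., :3] / homog[..., 3:4]

# Illustrative values only (assumed, not ManipulaPy defaults)
f, baseline, cx, cy = 500.0, 0.1, 3.0, 2.0
Q = np.array([[1.0, 0.0, 0.0, -cx],
              [0.0, 1.0, 0.0, -cy],
              [0.0, 0.0, 0.0, f],
              [0.0, 0.0, 1.0 / baseline, 0.0]])
disparity = np.full((4, 6), 25.0)
points = reproject_to_3d(disparity, Q)
```

With these values every pixel lands at depth Z = f * baseline / d = 2.0, and the pixel at the principal point maps to X = Y = 0.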
Default Camera Configuration
When no camera_configs argument is provided, the following default is used:
default_config = {
    "name": "default_camera",
    "translation": [0, 0, 0],
    "rotation": [0, 0, 0],
    "fov": 60,
    "near": 0.1,
    "far": 5.0,
    "intrinsic_matrix": [[500, 0, 320],
                         [0, 500, 240],
                         [0, 0, 1]],
    "distortion_coeffs": [0, 0, 0, 0, 0],
    "use_opencv": False,
    "device_index": 0
}
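The translation and rotation entries of a camera configuration are what the stored 4x4 extrinsic_matrix is built from. A minimal sketch of one way to combine them follows. It assumes the angles are in degrees and composed as R = Rz(yaw) @ Ry(pitch) @ Rx(roll), a common convention that this page does not confirm, and the function name make_extrinsic is hypothetical.

```python
import numpy as np

def make_extrinsic(translation, rotation_deg):
    """Build a 4x4 homogeneous transform from [x, y, z] and [roll, pitch, yaw] in degrees.

    Assumed convention: R = Rz(yaw) @ Ry(pitch) @ Rx(roll).
    """
    roll, pitch, yaw = np.radians(rotation_deg)
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)

    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])

    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # rotation block
    T[:3, 3] = translation     # translation column
    return T

# The default configuration (zero translation and rotation) yields the identity
T_default = make_extrinsic([0, 0, 0], [0, 0, 0])
# A 90 degree yaw rotates the x axis onto the y axis
T_yaw = make_extrinsic([1, 2, 3], [0, 0, 90])
```

If the library uses a different Euler order or a camera-to-world rather than world-to-camera convention, the rotation product or an inversion step would change accordingly.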
—
Error Handling and Validation
YOLO Model Management
Loading Failure: If the model fails to load, yolo_model is set to None and operation continues without detection
Detection Fallback: Returns empty arrays when YOLO is unavailable
Input Validation: Checks that rgb_image and depth_image are not None and are otherwise valid
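The fail-soft behavior listed above can be sketched as follows. This is an illustrative pattern rather than the module's actual code: the SafeDetector class, the loader callable, the shape check, and the shapes of the empty return arrays are all assumptions.

```python
import logging
import numpy as np

logger = logging.getLogger(__name__)

class SafeDetector:
    """Hypothetical wrapper showing load failure, fallback, and input checks."""

    def __init__(self, loader):
        # loader is any zero-argument callable that returns a model (assumed)
        try:
            self.yolo_model = loader()
        except Exception as exc:
            logger.warning("YOLO load failed, detection disabled: %s", exc)
            self.yolo_model = None

    def detect(self, rgb_image, depth_image):
        # Assumed empty result: no positions (0, 3) and no orientations (0,)
        empty = (np.empty((0, 3)), np.empty((0,)))
        if self.yolo_model is None:
            return empty
        if rgb_image is None or depth_image is None:
            return empty
        if rgb_image.shape[:2] != depth_image.shape[:2]:
            return empty
        # Hypothetical call; a real model would be invoked and post-processed here
        return self.yolo_model(rgb_image, depth_image)

def failing_loader():
    raise ImportError("ultralytics is not installed")

detector = SafeDetector(failing_loader)
positions, orientations = detector.detect(np.zeros((4, 4, 3)), np.zeros((4, 4)))
```

Returning empty arrays instead of raising lets callers iterate over results without special-casing a missing model.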
Stereo Processing Errors
Configuration Validation: Checks required keys in stereo configs
Runtime State Checking: Validates stereo_enabled before operations
Map Initialization: Ensures rectification maps are computed before use
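The three checks above could look roughly like the sketch below. The set of required keys, the exception types, and the helper names are assumptions chosen for illustration; the keys are borrowed from the camera storage format on this page.

```python
# Assumed minimum set of keys a stereo camera config must carry
REQUIRED_STEREO_KEYS = (
    "intrinsic_matrix",
    "distortion_coeffs",
    "translation",
    "rotation",
)

def validate_stereo_config(cfg, side):
    """Raise ValueError if a left/right config lacks a required key."""
    missing = [key for key in REQUIRED_STEREO_KEYS if key not in cfg]
    if missing:
        raise ValueError(f"{side} stereo config is missing keys: {missing}")

def require_stereo_ready(stereo_enabled, left_map_x, right_map_x):
    """Raise RuntimeError if stereo is disabled or the maps are not computed."""
    if not stereo_enabled:
        raise RuntimeError("Stereo processing is not enabled")
    if left_map_x is None or right_map_x is None:
        raise RuntimeError("Rectification maps have not been computed")

good_cfg = {
    "intrinsic_matrix": [[500, 0, 320], [0, 500, 240], [0, 0, 1]],
    "distortion_coeffs": [0, 0, 0, 0, 0],
    "translation": [0, 0, 0],
    "rotation": [0, 0, 0],
}
validate_stereo_config(good_cfg, "left")  # passes silently
```

Checking configuration once at setup and state at each call keeps the per-frame path cheap while still failing with a clear message.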
OpenCV Device Handling
Device Access Failure: Raises RuntimeError with descriptive message
Capture Validation: Uses cap.isOpened() to verify device accessibility
Resource Cleanup: Devices are released automatically in the destructor and through an explicit cleanup method
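A minimal sketch of the open-and-verify pattern described above is shown here. The open_device helper and its capture_factory parameter are hypothetical; the factory defaults to cv2.VideoCapture and exists only so the logic can be exercised without a physical camera.

```python
def open_device(device_index, capture_factory=None):
    """Open a capture device and raise RuntimeError if it is not accessible."""
    if capture_factory is None:
        import cv2  # imported lazily so the helper can be tested without OpenCV
        capture_factory = cv2.VideoCapture
    cap = capture_factory(device_index)
    if not cap.isOpened():
        cap.release()  # free the handle before reporting the failure
        raise RuntimeError(
            f"Could not open camera device {device_index}; "
            "check that it is connected and not in use"
        )
    return cap
```

A caller would typically pair this with an explicit release, for example by wrapping frame capture in try/finally and calling cap.release() in the finally clause.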
PyBullet Integration Safety
Parameter Reading: Safe handling of missing debug parameters
Matrix Validation: Debug logging for view/projection matrix inspection
Client State: No assumptions about PyBullet connection status
—
Performance Considerations
Memory Management
Image Arrays: Contiguous memory layout for OpenCV operations
Rectification Maps: Persistent storage for repeated stereo processing
Point Clouds: Filtered arrays to reduce memory footprint
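As one example of the point cloud filtering mentioned above, the sketch below drops non-finite points and points outside a depth range, then returns a contiguous float32 array. The filtering criteria, the max_depth default, and the function name are assumptions; the module may filter differently.

```python
import numpy as np

def filter_point_cloud(points, max_depth=5.0):
    """Return only finite points with 0 < z <= max_depth as a contiguous (N, 3) array."""
    pts = np.asarray(points, dtype=np.float32).reshape(-1, 3)
    finite = np.isfinite(pts).all(axis=1)    # rejects nan and inf in any coordinate
    z = pts[:, 2]
    in_range = (z > 0) & (z <= max_depth)    # comparisons with nan evaluate to False
    return np.ascontiguousarray(pts[finite & in_range])

raw = np.array([
    [0.1, 0.2, 1.5],       # kept
    [np.nan, 0.0, 1.0],    # dropped: nan coordinate
    [0.0, 0.0, np.inf],    # dropped: infinite depth
    [0.0, 0.0, -0.5],      # dropped: behind the camera
    [0.0, 0.0, 9.0],       # dropped: beyond max_depth
])
cloud = filter_point_cloud(raw)
```

Storing the result as float32 halves the memory of a float64 cloud, which is usually enough precision for metric scene coordinates.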
Computational Efficiency
YOLO Inference: Single forward pass per image
Stereo Matching: Uses StereoSGBM, which trades off match quality against speed
Depth Scaling: Vectorized operations for range conversion
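For context on the depth scaling step: PyBullet's getCameraImage returns a non-linear depth buffer with values in [0, 1], and the PyBullet documentation gives the conversion to metric distance as far * near / (far - (far - near) * depth). The sketch below applies that formula to a whole array at once. The function name is hypothetical, and whether ManipulaPy uses exactly this expression is an assumption.

```python
import numpy as np

def depth_buffer_to_meters(depth_buffer, near, far):
    """Convert a PyBullet depth buffer in [0, 1] to distances between near and far."""
    depth_buffer = np.asarray(depth_buffer, dtype=np.float64)
    # One vectorized expression over the whole image, with no Python-level loop
    return far * near / (far - (far - near) * depth_buffer)

# Using the default clipping planes from this page: near = 0.1, far = 5.0
buffer = np.array([[0.0, 0.5], [0.9, 1.0]])
depth_m = depth_buffer_to_meters(buffer, near=0.1, far=5.0)
```

A buffer value of 0 maps to the near plane and 1 maps to the far plane; values in between are compressed toward the near plane, which is why the conversion cannot be a simple linear rescale.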
Threading Considerations
OpenCV Capture: Single-threaded device access
YOLO Processing: GPU acceleration when available
PyBullet Integration: Simulation access must happen on the main thread
—
See Also
Perception API Reference – Higher-level perception capabilities using Vision
Utils Module – Mathematical utilities for transformations
Simulation API Reference – PyBullet simulation integration
Potential Field API Reference – Obstacle avoidance using vision data
External Dependencies
OpenCV – Computer vision algorithms and stereo processing
PyBullet – Physics simulation and camera rendering
Ultralytics YOLO – Object detection framework
NumPy – Numerical array operations
Matplotlib – Debug visualization