This project consists of two parts:
- HololensV2: The Unity project containing the HoloLens application
- EmbodiedCoTAria: The backend containing the code for transforming the Aria VRS data to RLDS format, and the code for training based on our improved E-CoT pipeline.
For an explanation of the project itself, visit our website.
The repository provides tools and scripts for converting raw VRS (Aria Glasses' native data format) files into structured datasets, extracting sensor data, and preparing data for downstream machine learning tasks. It supports dataset visualization, transformation, and integration with RLDS formats.
- VRS to Numpy/JSON Conversion: Scripts to extract and convert sensor data from VRS files.
- Dataset Transformation: Utilities to transform and standardize datasets for ML pipelines.
- Visualization: Tools for visualizing sensor data and dataset statistics.
- Hand, Gaze, and Speech Processing: Utilities for extracting and processing hand tracking, gaze, and speech data.
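As an illustration of the gaze-processing step, the sketch below converts gaze yaw/pitch angles into a 3D unit direction vector. The function name and the spherical convention are assumptions chosen for illustration; they are not taken from gaze_utils.py, and the actual Aria gaze model may use a different convention.

```python
import numpy as np

def gaze_angles_to_direction(yaw, pitch):
    """Convert gaze yaw/pitch angles (radians) to a 3D unit direction vector.

    Uses a plain spherical convention (illustrative only); the convention
    used by the actual Aria gaze pipeline may differ.
    """
    direction = np.array([
        np.sin(yaw) * np.cos(pitch),  # horizontal component
        np.sin(pitch),                # vertical component
        np.cos(yaw) * np.cos(pitch),  # forward component
    ])
    return direction / np.linalg.norm(direction)

# Looking straight ahead (yaw = pitch = 0) points along +Z.
print(gaze_angles_to_direction(0.0, 0.0))
```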
Contains the main dataset building and transformation utilities.
- aria_dataset/: Core dataset processing code.
  - aria_dataset_dataset_builder.py: Main builder for converting and structuring datasets.
  - conversion_utils.py: Helper functions for data conversion.
  - create_example_data.py: Script to generate example data.
  - peek_npy.py: Utility to inspect .npy files.
- example_transform/: Example scripts for dataset transformation.
- test_dataset_transform.py, visualize_dataset.py: Scripts for testing and visualizing datasets.
- environment_macos.yml, environment_ubuntu.yml: Environment setup files for different OSes. Note that the NumPy version pinned in these environments must be used to create the .npy files, which are then converted into RLDS.
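For reference, a minimal sketch of what an .npy inspection utility like peek_npy.py might do. The function name and the reported fields are assumptions for illustration, not taken from the actual script:

```python
import os
import tempfile
import numpy as np

def peek_npy(path):
    """Load a .npy file and report its shape, dtype, and value range."""
    arr = np.load(path, allow_pickle=False)
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(arr.min()),
        "max": float(arr.max()),
    }

# Example with hypothetical data: write a dummy image stack and inspect it.
path = os.path.join(tempfile.gettempdir(), "example.npy")
np.save(path, np.zeros((4, 480, 640, 3), dtype=np.uint8))
info = peek_npy(path)
print(info)
```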
Handles raw VRS files and their conversion.
- vrs_to_npy.py: Converts VRS files to NumPy format.
- extract_sensor_calibration.py: Extracts calibration data from VRS files.
- dataprovider_quickstart_tutorial.ipynb: Tutorial for data extraction and usage.
- vrs_data/, vrs_data_1/: Raw VRS files and extracted data directories.
- rlds_data/: RLDS-formatted datasets for training/validation.
- extracted/: Output from extraction scripts.
General-purpose utilities for data processing and analysis.
- aria_to_rlds.py: Converts Aria data to RLDS format.
- gaze_utils.py, hand_tracking_utils.py: Utilities for gaze and hand tracking data.
- process_fisheye_hands_voice.py: Script for processing hand and voice data.
- vrs_to_hdf5.py: Converts VRS data to HDF5 format.
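To illustrate the target structure of the RLDS conversion, the sketch below assembles one RLDS-style episode as a list of step dicts. The field names follow the public RLDS convention (observation, action, is_first, is_last, is_terminal); the actual schema produced by aria_to_rlds.py and the dataset builder may differ, and the helper name here is hypothetical.

```python
import numpy as np

def build_rlds_episode(images, actions, language_instruction):
    """Assemble one RLDS-style episode: a list of step dicts plus metadata.

    Hypothetical helper; field names follow the public RLDS convention,
    not necessarily the schema used by aria_to_rlds.py.
    """
    steps = []
    n = len(images)
    for i in range(n):
        steps.append({
            "observation": {"image": images[i]},
            "action": actions[i],
            "language_instruction": language_instruction,
            "is_first": i == 0,
            "is_last": i == n - 1,
            "is_terminal": i == n - 1,
        })
    return {"steps": steps, "episode_metadata": {"num_steps": n}}

# Example with dummy data: three frames and three 7-DoF actions.
images = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(3)]
actions = [np.zeros(7, dtype=np.float32) for _ in range(3)]
episode = build_rlds_episode(images, actions, "pick up the cup")
print(episode["episode_metadata"])
```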
This part of the README describes the HoloLens-specific portion of the Ego Tutor repository. It covers the Unity project targeting HoloLens 2 using OpenXR, with C# scripts and ShaderLab shaders to deliver interactive mixed reality experiences.
- Target device: HoloLens 2
- Runtime: OpenXR (Unity XR Plugin Management)
- Optional framework: MRTK (Mixed Reality Toolkit) for interactions and UX
- Languages used: C# (game logic, interactions), ShaderLab (visual effects)
The HoloLens portion is organized as follows (folder names may vary slightly depending on your project layout):
- Assets/
  - Scenes/ — Unity scenes for MR experiences (e.g., demo environments, sample interactions)
  - Scripts/ — C# scripts for input, interaction, spatial mapping, and application logic
  - Prefabs/ — Reusable configured objects (UI elements, interactables, anchors)
  - Materials/ — Materials used by scene objects
  - Shaders/ — ShaderLab files for custom rendering or visual effects
  - Textures/ — Static textures used in materials or UI
  - MRTK/ (optional) — Mixed Reality Toolkit packages and profiles if MRTK is used
  - Resources/ — Assets loaded at runtime (profiles, configuration data)
  - Plugins/ — Native or managed plugins (OpenXR, platform integrations)
- ProjectSettings/ — Unity project settings, including XR, input, and capabilities
- Packages/ — Unity package references (OpenXR, XR Management, MRTK, etc.)