We release MarsScapes, the first panorama dataset for Martian terrain understanding. The dataset contains 195 panoramas of the Martian surface with fine-grained annotations for semantic and instance segmentation, facilitating high-level scene understanding of Martian landforms and further enhancing the navigability of rovers over rough terrain in large areas. Note: due to file-size limits, we temporarily provide the first half of MarsScapes (i.e., samples 122_1 to 527_1) as a representative subset. All samples will be provided soon.
For the segmentation performance of learning-based methods on this dataset, please refer to our papers cited at the bottom of this page.
To characterize all landforms on Mars and label every pixel without omission, we define nine categories: soil, sand, gravel, bedrock, rocks, tracks, shadows, background, and unknown. Specific descriptions and examples of each category are given in supplementary.pdf.
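For reference, the nine category names can be mapped to integer label IDs as in the minimal Python sketch below; the names follow the list above, but the specific ID values are hypothetical and should be verified against the released label maps described further down.

```python
# Hypothetical ID-to-name mapping for the nine MarsScapes categories.
# The names follow the dataset description; the integer IDs are
# illustrative only -- verify them against the released label maps.
MARSSCAPES_CLASSES = {
    0: "soil",
    1: "sand",
    2: "gravel",
    3: "bedrock",
    4: "rocks",
    5: "tracks",
    6: "shadows",
    7: "background",
    8: "unknown",
}
```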
The raw Mars images are courtesy of NASA/JPL-Caltech; you can read the full use policy here. We select 3379 images that meet our criteria and employ the PTGui software to stitch them into 195 panoramas.
For annotation, we adopt PixelAnnotationTool, a manual annotation tool built on the watershed algorithm in OpenCV, which reduces part of our workload by automatically separating adjacent terrains with high contrast. To store the annotation data in our desired JSON format, we rewrite the create_poly_json.py file of the software.
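For intuition, the snippet below is a minimal sketch of marker-based watershed segmentation with OpenCV, the general routine that PixelAnnotationTool builds on; the input file name and seed placement are illustrative, not the tool's actual workflow.

```python
import cv2
import numpy as np

# Load a terrain patch (hypothetical file name).
image = cv2.imread("terrain_patch.png")

# Seed markers: 0 = undecided, positive integers = user-scribbled regions.
markers = np.zeros(image.shape[:2], dtype=np.int32)
markers[10:20, 10:20] = 1      # e.g. a scribble on bedrock
markers[100:110, 100:110] = 2  # e.g. a scribble on sand

# Watershed floods outward from the seeds; adjacent high-contrast
# terrains end up in different regions, with boundaries marked as -1.
cv2.watershed(image, markers)
boundary_mask = (markers == -1)
```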
Currently, you can download MarsScapes in the following ways; more efficient download options will be released soon.
MarsScapes, or via Baidu Cloud with passcode: masc
The data file structure of MarsScapes and the JSON format of a sample are shown in the following figures.
The image folder contains 195 panoramic RGB images, with widths ranging from 1230 to 12062 pixels and heights from 472 to 1649 pixels. Each image is stored with the naming convention <Sol>_<num>.png, where Sol denotes the number of Martian days Curiosity has traveled on Mars and num indexes the panoramas of that sol.
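A minimal Python sketch for parsing this naming convention (the folder path in the example is assumed from the file structure above):

```python
import re
from pathlib import Path

def parse_panorama_name(path):
    """Parse '<Sol>_<num>.png', e.g. '527_1.png' -> (527, 1)."""
    m = re.fullmatch(r"(\d+)_(\d+)", Path(path).stem)
    if m is None:
        raise ValueError(f"unexpected file name: {path}")
    return int(m.group(1)), int(m.group(2))

print(parse_panorama_name("image/527_1.png"))  # -> (527, 1)
```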
In the semantic folder, <Sol>_<num>_color.png is the visualization of the semantic annotations for the 9 categories; it is converted into a single-channel <Sol>_<num>_semanticId.png for semantic segmentation research. Unlike the semantic annotation of terrain classes, <Sol>_<num>_instanceId.png distinguishes all instances of the same class and can be used for instance segmentation research. In addition, <Sol>_<num>_polygon.json provides the annotations in a human-readable text format. Here we show panorama images, semantic segmentation annotations, and instance segmentation annotations of three samples in MarsScapes.
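As a quick sanity check, the label maps can be loaded as plain single-channel images; a minimal sketch follows, where the sample ID and folder layout are taken from the description above:

```python
import numpy as np
from PIL import Image

sample = "527_1"  # one of the released panoramas
semantic = np.array(Image.open(f"semantic/{sample}_semanticId.png"))
instance = np.array(Image.open(f"semantic/{sample}_instanceId.png"))

print("semantic label IDs present:", np.unique(semantic))
print("number of instance IDs:", len(np.unique(instance)))
```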
The processed folder contains pre-processed images for evaluating supervised learning and unsupervised domain adaptation (UDA) methods. The panorama samples of MarsScapes facilitate data augmentation toward a more diverse terrain distribution, which is crucial for promoting UDA performance. Following the SkyScapes dataset, we crop the panoramas and corresponding annotation images into 512 × 512 sub-images with 50% overlap between adjacent patches in both the horizontal and vertical directions. After horizontal flipping, we obtain 13618 images for the source domain and 7184 for the target domain.
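The sliding-window cropping can be sketched as below; the exact boundary handling and the order of cropping and flipping used to produce the released sub-images are assumptions here, so the patch counts may differ slightly from the official numbers.

```python
import numpy as np

def crop_with_overlap(img, size=512, overlap=0.5):
    """Crop size x size patches with the given fractional overlap in both
    directions, then add a horizontally flipped copy of each patch.
    Images smaller than `size` yield one truncated patch (a simplifying
    assumption, not necessarily the official pre-processing)."""
    stride = int(size * (1 - overlap))  # 256 px for 50% overlap
    h, w = img.shape[:2]
    patches = []
    for y in range(0, max(h - size, 0) + 1, stride):
        for x in range(0, max(w - size, 0) + 1, stride):
            patch = img[y:y + size, x:x + size]
            patches.append(patch)
            patches.append(patch[:, ::-1])  # horizontal flip
    return patches
```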
To put the data volume of our MarsScapes dataset in context, we compare it with SkyScapes, an aerial image dataset of urban infrastructure, in the following table.
| Dataset | Classes | Images | Sub-images for evaluation | Image size | Annotated pixels |
|---|---|---|---|---|---|
| SkyScapes | 31 | 16 | 17640 | 5616 × 3744 | 3.36×10⁸ |
| MarsScapes | 9 | 195 | 20802 | widths 1230–12062, heights 472–1649 | 3.92×10⁸ |
In terms of annotated pixels, the two datasets have a similar data volume. Processed by the same methods mentioned above, SkyScapes yields 17640 sub-images and MarsScapes 20802 sub-images for evaluation. Although the Martian terrain is not as diverse as the urban infrastructure in SkyScapes, annotating MarsScapes requires more labor for the following reasons:
- SkyScapes is collected in a structured environment, where the boundary of an instance can be described by regular line segments. In an unstructured environment like the Martian surface, however, terrain boundaries are mostly irregular and blurred;
- Most semantic segmentation datasets are collected on Earth, where most objects can be distinguished by color and shape. On Mars, however, the colors of the various terrains are so similar that labeling relies mainly on inconspicuous texture, so the annotation of MarsScapes requires manual effort rather than the assistance of annotation software;
- The classification of an unstructured terrain relies on its relationships with neighboring areas, which requires us to comply with more complex annotation standards.
In conclusion, MarsScapes provides enough samples with fine-grained annotations for training learning-based methods, thus contributing to autonomous navigation of rovers on Mars.
If you use MarsScapes in your research, please cite our papers:
```bibtex
@article{liu2023hybrid,
  title={A hybrid attention semantic segmentation network for unstructured terrain on {Mars}},
  author={Liu, Haiqiang and Yao, Meibao and Xiao, Xueming and Cui, Hutao},
  journal={Acta Astronautica},
  volume={204},
  pages={492--499},
  year={2023},
  publisher={Elsevier}
}

@article{liu2023marsscapes,
  title={{MarsScapes} and {UDAFormer}: A panorama dataset and a transformer-based unsupervised domain adaptation framework for {Martian} terrain segmentation},
  author={Liu, Haiqiang and Yao, Meibao and Xiao, Xueming and Zheng, Bo and Cui, Hutao},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  volume={62},
  pages={1--17},
  year={2023},
  publisher={IEEE}
}
```