A comprehensive Python package for Euclid archival data analysis, designed for use within the ESA Datalabs environment.
euclidkit facilitates advanced data exploration and visualization for Euclid Q1/(I)DR1 archival releases, including:
- Data Access: Query and crossmatch sources with the Euclid MER catalogue
- Spectroscopic Analysis: Access, download, and combine NISP spectra of archival sources
- Unified Workflow: Streamlined tools for researchers working with Euclid spectroscopic data
The package is designed for efficient archive querying and Euclid spectrum compilation workflows.
- Python 3.11+
- Access to ESA Datalabs environment (for data volumes)
- COSMOS credentials for Euclid archive access
Install the released package with pip:
pip install euclidkit
Or install from source:
git clone https://github.com/rudolffu/euclidkit.git
cd euclidkit
pip install -e .
Store credentials in a private file under your home directory and restrict permissions:
mkdir -p ~/.euclidkit
touch ~/.euclidkit/.cred.txt
chmod 600 ~/.euclidkit/.cred.txt
Edit ~/.euclidkit/.cred.txt manually with your preferred editor (do not put credentials in shell history).
Use two lines:
- COSMOS username
- COSMOS password
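As an illustration of the two-line format and the `chmod 600` restriction described above, here is a minimal sketch of how such a file could be read and sanity-checked. The `read_credentials` helper is hypothetical and is not part of euclidkit; it only uses the Python standard library.

```python
import os
import stat
import tempfile
from pathlib import Path


def read_credentials(path):
    """Return (username, password) from a two-line credentials file.

    Hypothetical helper for illustration only; not euclidkit's own loader.
    """
    cred_path = Path(path).expanduser()
    # On POSIX systems, refuse files that group or other users can access.
    mode = stat.S_IMODE(cred_path.stat().st_mode)
    if os.name == "posix" and mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{cred_path} is accessible by other users; run chmod 600")
    # Surrounding whitespace is stripped and blank lines are ignored.
    lines = [ln.strip() for ln in cred_path.read_text().splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected exactly two non-empty lines: username, password")
    return lines[0], lines[1]


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than real credentials.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / ".cred.txt"
        demo.write_text("my_username\nmy_password\n")
        demo.chmod(0o600)
        user, _ = read_credentials(demo)
        print(user)
```

Running a check like this before logging in can catch a file with a missing line or overly open permissions early.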
Create and edit the user config file:
euclidkit init-config --output ~/.euclidkit/euclidkit_config.yaml --template basic
Then edit ~/.euclidkit/euclidkit_config.yaml and set the credential path:
data:
  credentials_file: /home/<user>/.euclidkit/.cred.txt
# Note: the Python import path is currently still `euclidkit`.
from euclidkit.core.data_access import EuclidArchive
# Initialize archive connection
archive = EuclidArchive(environment='PDR')
archive.login()
# Crossmatch your sources with Euclid MER catalogue
results = archive.crossmatch_sources(
    user_table="my_sources.csv",
    radius=1.0,  # arcseconds
    output_file="crossmatch_results.fits"
)
# Query for available spectra
spectra_table = archive.query_spectra_sources(
    crossmatch_table=results,
    output_file="spectra_sources.fits"
)
# Combine spectra into a single FITS file
combined_file = archive.combine_spectra_to_fits(
    spectra_table=spectra_table,
    output_file="my_combined_spectra.fits"
)
# Crossmatch user table with Euclid MER catalogue
euclidkit crossmatch \
--input my_sources.csv \
--output crossmatch_results.fits \
--radius 1.0 \
--verbose
# Submit the entire table as a single async job (no batching). The output file
# will contain TAP job metadata instead of immediate crossmatch results.
euclidkit crossmatch \
--input my_sources.csv \
--output crossmatch_results.fits \
--full-async
# When using the IDR environment the command defaults to the WIDE field and
# writes results to wide_<filename>. Use --idr-field DEEP to query the deep stack:
euclidkit crossmatch \
--input my_sources.csv \
--output crossmatch_results.fits \
--environment IDR \
--idr-field DEEP
# Upload a FITS table to your Euclid TAP workspace
euclidkit upload-table \
--input my_sources.fits \
--table-name my_workspace_table \
--description "Sources awaiting deep crossmatch" \
--overwrite
# Upload CSV data as-is (format inferred automatically)
euclidkit upload-table \
--input trimmed_sources.csv \
--table-name trimmed_sources
# Query spectra from crossmatch results
euclidkit query-spectra \
--crossmatch crossmatch_results.fits \
--output spectra_sources.fits \
--verbose
# Query spectra by object IDs and auto-combine
euclidkit query-spectra \
--object-ids 123456,789012,345678 \
--output spectra_sources.fits \
--combine-output my_spectra.fits \
--max-spectra 100 \
--verbose
# Build Cutana CSV from a source table with object_id or ra/dec columns
euclidkit query-cutana \
--sources my_sources.fits \
--output cutana_input.csv \
--instrument VIS \
--cutout-size arcsec \
--cutout-size-value 15
# NISP example with explicit filters
euclidkit query-cutana \
--sources my_sources.fits \
--output cutana_input_nisp.csv \
--instrument NISP \
--nisp-filters NIR_Y,NIR_H \
--environment IDR \
--idr-field DEEP \
--cutout-size arcsec \
--cutout-size-value 15
# Compile individual spectra into chunked FITS files
euclidkit compile-spectra \
--spectra-table spectra_sources.fits \
--output-dir ./output \
--prefix compiled_spectra \
--max-extensions 1000 \
--verbose
Note: for canonical compilation from local Datalabs FITS volumes, `--workers 2` is often no faster than a single worker because of shared-storage I/O contention. Prefer `--workers 1` unless benchmarking on your setup shows a clear gain.
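To picture what a limit such as `--max-extensions 1000` implies, the sketch below splits a table of N spectra into consecutive chunks of at most that many rows, one chunk per output file. This is a generic illustration and not euclidkit's implementation; in particular, the file-naming pattern is an assumption made up for the example.

```python
from math import ceil


def chunk_rows(n_rows, max_extensions):
    """Yield (start, stop) index pairs covering n_rows in chunks of at most max_extensions."""
    if max_extensions < 1:
        raise ValueError("max_extensions must be positive")
    for start in range(0, n_rows, max_extensions):
        yield start, min(start + max_extensions, n_rows)


def chunk_filenames(n_rows, max_extensions, prefix):
    """Return one file name per chunk.

    The '<prefix>_NNNN.fits' pattern is hypothetical, not euclidkit's actual scheme.
    """
    n_chunks = ceil(n_rows / max_extensions)
    return [f"{prefix}_{i:04d}.fits" for i in range(n_chunks)]


if __name__ == "__main__":
    # 2500 spectra with at most 1000 per file gives three chunks.
    for start, stop in chunk_rows(2500, 1000):
        print(start, stop)
    print(chunk_filenames(2500, 1000, "compiled_spectra"))
```

Each `(start, stop)` pair can be used to slice the spectra table, so every output file stays within the chosen extension limit.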
- Multiple Environments: Support for PDR, IDR, OTF, and REG archive environments
- Efficient Queries: Batch processing with TAP table uploads for large datasets
- Crossmatching: Position-based matching with configurable search radius
- Spectrum Access: Direct access to Euclid data volumes on ESA Datalabs
- FITS Compilation: Combine individual spectra into multi-extension FITS files
- Metadata Preservation: Maintain source IDs, coordinates, and provenance information
- Quality Control: Spectrum validation and quality assessment
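The position-based crossmatching listed above can be illustrated with a small, self-contained sketch: for each input source, keep the nearest catalogue entry whose angular separation is within the search radius. This is a brute-force toy version using only the standard library; it is not how euclidkit or the archive performs the match, and the function names are hypothetical.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (inputs in degrees), via the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def crossmatch(sources, catalogue, radius_arcsec=1.0):
    """Return (source_index, catalogue_index, separation_arcsec) for the nearest match within the radius."""
    matches = []
    for i, (ra, dec) in enumerate(sources):
        best = None
        for j, (cat_ra, cat_dec) in enumerate(catalogue):
            sep = angular_separation_arcsec(ra, dec, cat_ra, cat_dec)
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (j, sep)
        if best is not None:
            matches.append((i, best[0], best[1]))
    return matches


if __name__ == "__main__":
    # Made-up coordinates in degrees, for illustration only.
    sources = [(150.0, 2.0), (150.1, 2.1)]
    catalogue = [(150.0, 2.0 + 0.5 / 3600), (151.0, 3.0)]
    print(crossmatch(sources, catalogue, radius_arcsec=1.0))
```

A larger radius recovers more counterparts at the cost of more chance alignments, which is why the radius is left configurable.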
This package is optimized for the ESA Datalabs environment with direct access to:
- Euclid Q1 Data: /data/euclid_q1/ (35 TB volume)
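Because the workflows here rely on that data volume being mounted, a quick check before starting a long job can save time. The sketch below is one possible way to do it with the standard library; `volume_available` is a hypothetical helper, not a euclidkit function.

```python
from pathlib import Path

# Volume path as quoted in this README.
EUCLID_Q1_VOLUME = Path("/data/euclid_q1")


def volume_available(path=EUCLID_Q1_VOLUME):
    """Return True if the path is an existing, readable directory with at least one entry."""
    p = Path(path)
    if not p.is_dir():
        return False
    try:
        return any(p.iterdir())
    except PermissionError:
        return False


if __name__ == "__main__":
    if volume_available():
        print("Euclid Q1 volume is mounted")
    else:
        print("Euclid Q1 volume not found; are you running inside ESA Datalabs?")
```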
Main interface to the Euclid science archive.
from euclidkit.core.data_access import EuclidArchive
archive = EuclidArchive(environment='PDR')
archive.login(credentials_file='~/.euclidkit/.cred.txt')
# Crossmatch sources
results = archive.crossmatch_sources(
    user_table="sources.csv",
    radius=1.0,
    output_file="results.fits"
)
# Query spectra
spectra = archive.query_spectra_sources(
    crossmatch_table=results,
    output_file="spectra.fits"
)
# Get individual spectrum
spectrum_hdu = archive.get_individual_spectrum(
    datalabs_path="/data/euclid_q1/path",
    file_name="spectrum_file.fits",
    hdu_index=42
)
# Combine spectra
combined = archive.combine_spectra_to_fits(
    spectra_table=spectra,
    output_file="combined.fits",
    max_spectra=1000
)
Advanced spectrum compilation with chunking support.
from euclidkit.core.spectra import SpectrumCompiler
compiler = SpectrumCompiler(max_extensions=1000)
# Compile into chunked files
output_files = compiler.compile_spectra(
    spectra_table=spectra_table,
    output_dir="./output",
    output_prefix="compiled_spectra"
)
# Create single FITS file
single_file = compiler.compile_single_fits(
    spectra_table=spectra_table,
    output_file="all_spectra.fits"
)
# Generate metadata table
metadata = compiler.create_metadata_table(
    spectra_table=spectra_table,
    output_files=output_files,
    output_dir="./output"
)
from euclidkit.core.data_access import EuclidArchive
from euclidkit.core.spectra import SpectrumCompiler
import pandas as pd
# 1. Initialize archive
archive = EuclidArchive(environment='PDR')
archive.login()
# 2. Load your QSO candidates
qso_candidates = pd.read_csv('qso_candidates.csv')
# 3. Crossmatch with Euclid MER catalogue
crossmatches = archive.crossmatch_sources(
    user_table=qso_candidates,
    radius=2.0,  # 2 arcsecond radius
    output_file='qso_crossmatches.fits'
)
# 4. Find available spectra
spectra_sources = archive.query_spectra_sources(
    crossmatch_table=crossmatches,
    output_file='qso_spectra_sources.fits'
)
print(f"Found {len(spectra_sources)} spectra for {len(crossmatches)} crossmatches")
# 5. Create combined FITS file (for small samples)
if len(spectra_sources) <= 1000:
    combined_spectra = archive.combine_spectra_to_fits(
        spectra_table=spectra_sources,
        output_file='qso_combined_spectra.fits'
    )
    print(f"Combined spectra saved to: {combined_spectra}")
# 6. Or use chunked compilation for large samples
else:
    compiler = SpectrumCompiler(max_extensions=2000)
    output_files = compiler.compile_spectra(
        spectra_table=spectra_sources,
        output_dir='./spectra_chunks',
        output_prefix='qso_spectra'
    )
    print(f"Created {len(output_files)} chunked files")
archive.logout()
Check your installation and environment:
# Check all components
euclidkit diagnostics
# Check specific components
euclidkit diagnostics --check-deps --check-data
- PDR: Public Data Release
- IDR: Internal Data Release (only accessible to Euclid Consortium members)
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
For detailed documentation and examples, visit:
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: fuympku@outlook.com
Yuming Fu (@rudolffu)
- Email: fuympku@outlook.com
- GitHub: https://github.com/rudolffu/euclidkit
This project is licensed under the GNU General Public License - see the LICENSE file for details.
- ESA Euclid Mission and Euclid Consortium
- ESA Datalabs and Euclid Data Space infrastructure team
- Astropy and astroquery communities
- Spectroscopic Pipeline: Complete pipeline for accessing and combining Euclid spectra
- CLI Integration: Added `--combine-output` option to `query-spectra` command
- TAP Upload: Improved query performance using TAP table uploads
- FITS Compilation: Efficient multi-extension FITS file creation
- Error Handling: Robust handling of long filenames and missing data
See CHANGELOG.md for detailed version history.