@kaseyLee123
Contributor

Short description: Include what type of data being ingested and appropriate references.

Link to relevant issue: Closes #614

For data ingests:

  • includes script used for ingest
  • includes modified JSON files
  • Add new tests
  • Update the Versions table

@kaseyLee123 kaseyLee123 self-assigned this Jun 18, 2025
try:
    filtered_results = results[(results["ab_flags"] == '00') & (results["cc_flags"] == '0000')][0]
    writer.writerow([
        filtered_results[0],
Collaborator

Include the db source name in the CSV file. I think it's source["source"].

Comment on lines 71 to 73
source_num+=1
print(source_num)
logger.warning("no source match found")
Collaborator

Also make a CSV file of the sources with no match.

Collaborator

Also, add counters. Count how many sources have multiple matches, one match, and no match.
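
A minimal sketch of the bookkeeping suggested in this thread, assuming the matches for each SIMPLE source have already been collected into a dictionary. The names `matches_by_source` and `tally_matches` and the sample data are hypothetical and not part of the PR's script.

```python
import csv
import io
from collections import Counter


def tally_matches(matches_by_source, no_match_file):
    """Count sources with no, one, or multiple matches.

    matches_by_source maps a SIMPLE source name to a list of matched rows
    (a hypothetical structure). The names of sources with no match are
    written to no_match_file, one per row.
    """
    counts = Counter(none=0, single=0, multiple=0)
    writer = csv.writer(no_match_file)
    writer.writerow(["source"])  # header row
    for source_name, matches in matches_by_source.items():
        if len(matches) == 0:
            counts["none"] += 1
            writer.writerow([source_name])
        elif len(matches) == 1:
            counts["single"] += 1
        else:
            counts["multiple"] += 1
    return counts


# Made-up example data; real rows would come from the catalog query.
example = {
    "source A": [],
    "source B": [{"w1mpro": 15.1}],
    "source C": [{"w1mpro": 14.2}, {"w1mpro": 14.9}],
}
buffer = io.StringIO()  # stands in for an open CSV file
counts = tally_matches(example, buffer)
```

In the real script the `io.StringIO` buffer would be replaced by a file opened with `open(..., "w", newline="")`, and the three counts could be logged at the end of the run.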

Comment on lines 56 to 65
filtered_results[1],
filtered_results[2],
filtered_results[3],
filtered_results[4],
filtered_results[7],
filtered_results[8],
filtered_results[9],
filtered_results[10],
filtered_results[11],
filtered_results[12]
Collaborator

I'm sure there's a better way to convert this list into a comma delimited string.
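
One possible way to avoid indexing each field by hand is to list the wanted column names once and build the row with a comprehension. This is only a sketch: it assumes the matched row can be indexed by column name (as an astropy table row can), and the helper `write_match`, the plain dict, and the values below are stand-ins for illustration.

```python
import csv
import io

# Column names taken from the query's `columns` argument in this PR,
# leaving out ab_flags and cc_flags as the original writerow call does.
COLUMNS = [
    "source_name", "PMRA", "sigPMRA", "PMDec", "sigPMDec",
    "w1mpro", "w1sigmpro", "w2mpro", "w2sigmpro", "ra", "dec",
]


def write_match(writer, simple_name, row):
    """Write one CSV row: the SIMPLE source name, then the selected columns."""
    writer.writerow([simple_name] + [row[name] for name in COLUMNS])


# Made-up values standing in for one matched catalog row.
row = {name: i for i, name in enumerate(COLUMNS)}
row["source_name"] = "J0000+0000"

buffer = io.StringIO()  # stands in for an open CSV file
writer = csv.writer(buffer)
writer.writerow(["source"] + COLUMNS)  # header row
write_match(writer, "source A", row)
```

Selecting by name rather than by position also keeps the output stable if the order of the returned columns ever changes.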

results = Irsa.query_region(coordinates=coord, spatial='Cone', catalog='catwise_2020', radius=0.5 * u.arcmin, columns="source_name,PMRA,sigPMRA,PMDec,sigPMDec,ab_flags,cc_flags,w1mpro,w1sigmpro,w2mpro,w2sigmpro,ra,dec")

try:
    filtered_results = results[(results["ab_flags"] == '00') & (results["cc_flags"] == '0000')][0]
Collaborator

There is no particular motivation for picking the first one. If there are multiple matches which pass these flags, you should keep them; I would save all matches that pass these filters to the CSV file. If there are sources with multiple matches, save them to a DIFFERENT CSV file and we can look at them individually.
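
A rough sketch of that suggestion: keep every row that passes the flag cut and route sources with several passing rows to a separate file. The helper names `passes_flags` and `route_matches` and the sample rows are hypothetical; the rows are plain dicts here, standing in for catalog table rows.

```python
import csv
import io


def passes_flags(row):
    """Same cut as in the PR: ab_flags == '00' and cc_flags == '0000'."""
    return row["ab_flags"] == "00" and row["cc_flags"] == "0000"


def route_matches(simple_name, rows, single_writer, multiple_writer):
    """Write every row that passes the flag cut, not just the first one.

    A source with exactly one passing row goes to single_writer; a source
    with several goes to multiple_writer so it can be inspected by hand.
    Returns the number of passing rows (0 means nothing was written).
    """
    good = [row for row in rows if passes_flags(row)]
    writer = single_writer if len(good) == 1 else multiple_writer
    for row in good:
        writer.writerow([simple_name, row["source_name"]])
    return len(good)


# Made-up rows: two pass the cut, one does not.
rows = [
    {"source_name": "J0000+0000a", "ab_flags": "00", "cc_flags": "0000"},
    {"source_name": "J0000+0000b", "ab_flags": "00", "cc_flags": "0000"},
    {"source_name": "J0000+0000c", "ab_flags": "01", "cc_flags": "0000"},
]
single_buffer = io.StringIO()    # stands in for the main CSV file
multiple_buffer = io.StringIO()  # stands in for the multiple-match CSV file
n_good = route_matches(
    "source A", rows, csv.writer(single_buffer), csv.writer(multiple_buffer)
)
```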

Comment on lines 45 to 50
for source in sources:
    #create skycoord object because one of the parameters for query region for position
    coord = SkyCoord(ra = source["ra"], dec = source["dec"], unit = "deg", frame = "icrs")

    # generates a list of objects from the catwise2020 catalogs that are within this radius of a certain position/coordinate
    results = Irsa.query_region(coordinates=coord, spatial='Cone', catalog='catwise_2020', radius=0.5 * u.arcmin, columns="source_name,PMRA,sigPMRA,PMDec,sigPMDec,ab_flags,cc_flags,w1mpro,w1sigmpro,w2mpro,w2sigmpro,ra,dec")
Collaborator

Loop over results instead of sources.

@kaseyLee123
Contributor Author

I keep getting this error, which I think means that query_region doesn't accept multiple values/whole columns (?)
Traceback (most recent call last):
  File "/Users/kasey/Documents/GitHub/SIMPLE-db/scripts/ingests/catwise/create_matches_csv.py", line 29, in <module>
    results = Irsa.query_region(coordinates=coord_vector, spatial='Cone', catalog='catwise_2020', radius=0.5 * u.arcmin, columns="source_name,PMRA,sigPMRA,PMDec,sigPMDec,ab_flags,cc_flags,w1mpro,w1sigmpro,w2mpro,w2sigmpro,ra,dec")
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/astropy/utils/decorators.py", line 618, in wrapper
    return function(*args, **kwargs)
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/astroquery/ipac/irsa/core.py", line 219, in query_region
    response = self.query_tap(query=adql)
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/astroquery/ipac/irsa/core.py", line 73, in query_tap
    return self.tap.search(query, language='ADQL', maxrec=maxrec)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/tap.py", line 282, in run_sync
    **keywords).execute()
    ~~~~~~~^^
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/tap.py", line 1121, in execute
    return TAPResults(self.execute_votable(), url=self.queryurl, session=self._session)
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/adhoc.py", line 111, in __init__
    super().__init__(votable, url=url, session=session)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/query.py", line 338, in __init__
    raise DALQueryError(self._status[1], self._status[0], url)
pyvo.dal.exceptions.DALQueryError: UsageFault: BAD_REQUEST: Invalid or unsupported ADQL query string. See TAP documentation here: https://irsa.ipac.caltech.edu/docs/program_interface/TAP.html

Comment on lines 29 to 30
results = Irsa.query_region(coordinates=coord_vector, spatial='Cone', catalog='catwise_2020', radius=0.5 * u.arcmin,
columns="source_name,PMRA,sigPMRA,PMDec,sigPMDec,ab_flags,cc_flags,w1mpro,w1sigmpro,w2mpro,w2sigmpro,ra,dec")
Collaborator

@kelle kelle Jun 26, 2025

Contributor Author

It gives me a warning saying this:
WARNING: AstropyDeprecationWarning: "verbose" was deprecated in version 4 and will be removed in a future version. [main]

Collaborator

The deprecation warning is ok. Does it provide any additional output when you set it to True?

Contributor Author

Traceback (most recent call last):
  File "/Users/kasey/Documents/GitHub/SIMPLE-db/scripts/ingests/catwise/create_matches_csv.py", line 29, in <module>
    results = Irsa.query_region(coordinates=coord_vector, spatial='Cone', catalog='catwise_2020', radius=0.5 * u.arcmin, columns="source_name,PMRA,sigPMRA,PMDec,sigPMDec,ab_flags,cc_flags,w1mpro,w1sigmpro,w2mpro,w2sigmpro,ra,dec", verbose=True)
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/astropy/utils/decorators.py", line 618, in wrapper
    return function(*args, **kwargs)
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/astroquery/ipac/irsa/core.py", line 219, in query_region
    response = self.query_tap(query=adql)
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/astroquery/ipac/irsa/core.py", line 73, in query_tap
    return self.tap.search(query, language='ADQL', maxrec=maxrec)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/tap.py", line 282, in run_sync
    **keywords).execute()
    ~~~~~~~^^
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/tap.py", line 1121, in execute
    return TAPResults(self.execute_votable(), url=self.queryurl, session=self._session)
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/adhoc.py", line 111, in __init__
    super().__init__(votable, url=url, session=session)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/simple-db/lib/python3.13/site-packages/pyvo/dal/query.py", line 338, in __init__
    raise DALQueryError(self._status[1], self._status[0], url)
pyvo.dal.exceptions.DALQueryError: UsageFault: BAD_REQUEST: Invalid or unsupported ADQL query string. See TAP documentation here: https://irsa.ipac.caltech.edu/docs/program_interface/TAP.html

I think this means it still doesn't accept whatever I passed?

Collaborator

And it's not giving us any more useful information. Let me try to get the astroquery folks to help us....

Collaborator

I've chatted with the astroquery developers and it seems like this is a real bug. I've opened an issue here: astropy/astroquery#3360

Collaborator

Contributor Author

I notice it says "A Table can also be used to specify the coordinates in a region query if it contains the columns _RAJ2000 and _DEJ2000", but do these columns exist in sources? Or are these just the equivalent of RA and DEC?
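
A small sketch of one way to get columns with those names, assuming `sources` holds `ra` and `dec` in degrees (as the SkyCoord call earlier in this PR suggests). The helper `to_vizier_columns` and the sample values are hypothetical, and the commented-out astropy/astroquery lines are an untested assumption about how the result might be passed to the query.

```python
def to_vizier_columns(sources):
    """Copy ra/dec values (degrees) into columns named as in the quoted sentence."""
    return {
        "_RAJ2000": [source["ra"] for source in sources],
        "_DEJ2000": [source["dec"] for source in sources],
    }


# Made-up sources; real ones would come from the SIMPLE database.
sources = [
    {"source": "source A", "ra": 10.5, "dec": -20.25},
    {"source": "source B", "ra": 150.0, "dec": 2.5},
]
columns = to_vizier_columns(sources)

# With astropy and astroquery installed, the dictionary could then be turned
# into a table and passed as the coordinates of a region query (not run here;
# CATALOG_ID is a placeholder, and the columns may need a degree unit attached):
#   from astropy.table import Table
#   import astropy.units as u
#   from astroquery.vizier import Vizier
#   coords = Table(columns)
#   results = Vizier.query_region(coords, radius=0.5 * u.arcmin, catalog=CATALOG_ID)
```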

Contributor Author

@kaseyLee123
Contributor Author

When I add the db source to the file, I won't be able to use source["source"] anymore, so I would have to use find_source_in_db, right? I remember doing this in my original script, where everything was in one file, and there were a few sources that didn't exist. In that case, do I ingest them, or collect these sources somewhere else?

@kelle
Collaborator

kelle commented Jun 27, 2025

When I add the db source to the file, I won't be able to use source["source"] anymore, so I would have to use find_source_in_db, right? I remember doing this in my original script, where everything was in one file, and there were a few sources that didn't exist. In that case, do I ingest them, or collect these sources somewhere else?

I see what you're saying - if you query IRSA with a vector of coords, you don't have easy access to the SIMPLE source name. I would just ignore this for now.

In this case, you are comparing everything in SIMPLE to CatWISE. There should be no cases where you need to ingest a new source into SIMPLE. BUT, there will be cases where there is no match in CatWISE to the SIMPLE source. And yes, we'd like to keep track of those. But it depends on how we end up doing the search....

Comment on lines 34 to 35
results = Irsa.query_region(coordinates=coord_vector, spatial='Cone', catalog='catwise_2020', radius=0.5 * u.arcmin,
columns="source_name,PMRA,sigPMRA,PMDec,sigPMDec,ab_flags,cc_flags,w1mpro,w1sigmpro,w2mpro,w2sigmpro,ra,dec", verbose=True)
Collaborator

Change this to use Vizier instead of IRSA.

@kaseyLee123
Contributor Author

Screenshot 2025-07-05 at 16 13 57

I just pushed my code, which ran. The numbers seem a little off from what we expected (I think around 890 sources not matched).

@kaseyLee123
Contributor Author

I'm also realizing I probably should've made a CSV for the skipped sources, since those can include cases where the photometry error is not available or the source is a duplicate. I also did not ingest proper motions yet because I wasn't sure what to enter for the proper motions source, since there is no column for that. In the commented-out version I just used Maro21 for everything, since I think that is the CatWISE2020 paper.

Development

Successfully merging this pull request may close these issues.

Ingest CATWISE data

2 participants