-
Notifications
You must be signed in to change notification settings - Fork 3
Mapper unit testing #530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Mapper unit testing #530
Conversation
ef46449 to
afe89bb
Compare
|
@amywieliczka Here are the pre-mapping methods (rikolti__pre_mapping) returned by the registry, by mapper type, for each currently-implemented mapper on {
"oac.oac": ["/select-oac-id"],
"flickr.sppl": ["/select-id?prop=id"],
"flickr.sdasm": ["/select-id?prop=id"],
"flickr.flickr": ["/select-id?prop=id"],
"oai.samvera": ["/select-id?prop=id"],
"oai.omeka": [],
"oai.csu_dspace": ["/select-id?prop=id"],
"oai.omeka.csa": [],
"oai.omeka.nothumb": [],
"oai.oai": [],
"oai.quartex": ["/select-id?prop=id"],
"oai.chapman": ["/select-id?prop=id"],
"oai.islandora.chs": ["/select-id?prop=id"],
"oai.islandora.burbank": ["/select-id?prop=id"],
"oai.content_dm.csudh": ["/select-id?prop=id"],
"oai.content_dm.pepperdine": ["/select-id?prop=id"],
"oai.content_dm.blackgold": ["/select-id?prop=id"],
"oai.content_dm.arck": ["/select-id?prop=id"],
"oai.content_dm.cvpl": ["select-id?prop=id"],
"oai.content_dm.lapl": ["/select-id?prop=id"],
"oai.content_dm.chico": ["/select-id?prop=id"],
"oai.content_dm.contentdm": [],
"oai.content_dm.csu_sac": ["/select-id?prop=id"],
"oai.tv_academy": ["/select-id?prop=id"],
"oai.csu_dspace.csuci": [],
"oai.cca_vault": ["/select-id?prop=id"],
"oai.up": ["/select-id?prop=id"],
"oai.pspl": [],
"oai.islandora": ["/select-id?prop=id"],
"oai.yosemite": ["/select-id?prop=id"],
"nuxeo.nuxeo": ["/select-id?prop=uid"],
}At quick glance, it looks like it's pretty consistent - there are some OAI mappers that don't have any pre-mapping, but those that do are all the same. |
|
@amywieliczka Ok, here is the full dataset for all 55 defined mapper types. Format is:
{
"oac_dc": {
"/select-oac-id":[2,7,9,10,14,42,44,45,48,49,51,52,53,54,55,56,57,58,59,60]
},
"ucd_json": {
"":[8,3378,3696,5656,8918,9757,10503,27202,27275,27306,27307,27548,27685,28047,28048,28049,28050,28051,28052,28053]
},
"dublin_core": {
"/select-id?prop=id":[11,13,75,101,123,124,125,126,127,130,131,133,134,135,149,151],
"/select-id": [32, 156, 9075, 11353],
},
"ucldc_nuxeo": {
"/select-id?prop=uid":[19,20,21,22,27,28,29,30,31,36,38,40,50,65,66,68,69,76,78,79]
},
"ucsb_aleph_marc": {"/ucsb-aleph-marc-id": [113]},
"ucb_tind_marc": {
"/select-id?prop=id":[150,747,1454,2515,3461,3928,4442,5058,5636,6083,6200,7497,7875,8776,8980,9673,10373,10425,11167,11237]
},
"ucsc_oai_dpla": {
"/select-id?prop=id":[153,154,157,159,4877,5014,5105,5191,12661,16530,18580,23349,27086,27156,27371,27736,27737,27738,27739,27740]
},
"ucsd_blacklight_dc": {
"/select-id?prop=id,/jsonfy-prop":[172,173,174,178,183,184,186,516,1092,1412,2497,3475,4155,5020,5344,5938,8540,8844,9652,10125]
},
"csa_omeka": {
"/select-id?prop=id":[232,3297,5445,8476,14210,17012,18384,19562,20936,24852,26673,27142,27277,27278,27279,27280,27281,27282,27283,27284]
},
"ucla_solr_dc": {"/select-id?prop=PID": [354, 17210, 26092, 26093]},
"quartex_oai": {
"/select-id?prop=id":[2206,2287,2487,2658,6137,10527,20666,23760,26212,26213,26214,26215,26216,26217,26218,26220,26486,26487,26488,26489]
},
"sjsu_islandora": {
"/select-id?prop=id":[3155,6406,7523,7855,8736,9615,9825,11725,13181,17161,19008,19256,19264,19303,20698,25340,26574,26575,26598,26599]
},
"cca_vault_oai_dc": {"/select-id?prop=id": [3433, 3767, 26391, 26470]},
"chs_islandora": {
"/select-id?prop=id":[3501,6124,26726,26727,26923,26924,26925,26926,27504,27505,27506,27507,27508,27509,27510,27511,27512,27513,27514,27515]
},
"contentdm_oai_dc": {
"/select-id?prop=id":[4425,5401,7955,8235,9844,13762,14885,15063,15934,16161,16887,16888,16908,17466,18106,18980,19404,19611,23030,24195]
},
"preservica_api": {
"/select-preservica-id":[4789,5861,7157,7562,9160,10207,11143,11367,15686,18469,21804,22816,24251,24415,25302,26462,26464,26670,26671],
"/select-id?prop=id": [26460],
},
"black_gold_oai": {
"/select-id?prop=id":[4891,26865,26866,26867,26868,26869,26870,26896,26897,27844]
},
"omeka_santa_clara": {"/select-id?prop=id": [5138, 27798]},
"calpoly_oai_dc": {
"/select-id?prop=id":[9670,9674,10423,14360,16958,18308,19436,20425,26796,26797,26798,26799,26800,26801,26802,26803,26804,26805,26806,26807]
},
"csu_sac_oai_dc": {"/select-id?prop=id": [11068, 26723, 27968]},
"csudh_contentdm_oai_dc": {
"/select-id?prop=id":[14167,26203,26397,26398,26399,26400,26401,26402,26403,26944,27288]
},
"chula_vista_pl_contentdm_oai_dc": {
"select-id?prop=id": [18598, 26477, 26480, 26481, 26482]
},
"lapl_oai": {
"/select-id?prop=id":[26094,27219,27220,27221,27222,27223,27224,27225,27226,27227]
},
"sfpl_marc": {"/sfpl-marc-id": [26095]},
"lapl_26096": {"/select-id?prop=id": [26096]},
"ucsf_solr": {"": [26099, 26100, 28029, 28030, 28031, 28032, 28033, 28034]},
"cavpp_islandora": {
"/select-id?prop=id":[26198,26225,26226,26227,26228,26229,26230,26231,26232,26233,26235,26236,26237,26239,26240,26241,26242,26243,26244,26245]
},
"up_oai_dc": {
"/select-id?prop=id":[26224,26309,26310,26311,26312,26313,26314,26315,26316,26317,26318,26319,26320,26321,26323,26324,26325,26326,26849,26850]
},
"chapman_oai_dc": {
"/select-id?prop=id":[26284,26285,26286,26287,26288,26289,26290,26291,26292,26293,26294,26295,26296,26297,26328,26329,26330,26331,26332,26333]
},
"califa_oai_dc": {"/select-id?prop=id": [26483]},
"csl_marc": {"/csl-marc-id": [26559]},
"contentdm_oai_dc_get_sound_thumbs": {"/select-id?prop=id": [26684]},
"pspl_oai_dc": {
"/select-id?prop=id": [26707, 26715, 26716, 26717, 26718, 26719, 26720, 26721]
},
"omeka": {
"/select-id?prop=id":[26724,26725,27121,27122,27140,27144,27145,27147,27149,27150,27165,27433,27618,27619,27620,27621,27622,27623,27624,27625]
},
"chico_oai_dc": {
"/select-id?prop=id": [26762, 26763, 26764, 26765, 26766, 26768, 26769]
},
"ucb_bampfa_solr": {"/select-id?prop=id,/jsonfy-prop": [26772]},
"islandora_oai_dc": {
"/select-id?prop=id":[26773,26774,26775,26776,26777,26778,26779,26885,26951]
},
"flickr_api": {"/select-id?prop=id": [26857, 27125, 27316]},
"youtube_video_snippet": {
"/select-id?prop=id":[26858,27015,27022,27023,27024,27029,27030,27081,27162,27163,27274,27296,27360,27419,27420,27421,27422,27423]
},
"csuci_mets": {"/select-id?prop=id": [26928]},
"pastperfect_xml": {
"/select-id?prop=metadata/identifier": [26935, 27166, 27429, 27434, 27566]
},
"caltech_restrict": {"/select-id?prop=id": [26952, 27118]},
"usc_oai_dc": {
"/select-id?prop=id":[26988,27087,27089,27090,27091,27092,27093,27094,27095,27096,27097,27098,27099,27100,27101,27102,27103,27104,27105,27106]
},
"yosemite_oai_dc": {"/select-id?prop=id": [26989, 27016, 27631, 27632, 27633]},
"emuseum_xml": {"/select-id?prop=metadata/identifier": [27011]},
"csu_dspace_mets": {"/select-id?prop=id": [27085]},
"sierramadre_marc": {"/csl-marc-id": [27127]},
"burbank_islandora": {
"/select-id?prop=id":[27128,27129,27130,27131,27132,27133,27134,27135,27136,27137,27138]
},
"omeka_nothumb": {"/select-id?prop=id": [27143, 27146, 27148]},
"sanjose_pastperfect": {"/select-id?prop=metadata/identifier": [27289]},
"tv_academy_oai_dc": {"/select-id?prop=id": [27305]},
"flickr_sdasm": {"/select-id?prop=id": [27324, 27424, 27425, 27426, 27427]},
"flickr_sppl": {
"/select-id?prop=id":[27338,27362,27363,27364,27365,27366,27367,27373,27374,27375,27376,27377,27378,27379,27380,27381,27382,27383,27384,27386]
},
"internet_archive": {"/select-id?prop=identifier": [27369, 27616]},
"arck_oai": {
"/select-id?prop=id":[27564,27582,27583,27584,27585,27586,27587,27588,27589,27590,27591]
},
} |
12aa322 to
31ef66b
Compare
31ef66b to
73c967a
Compare
|
@timfrazee wow, this is really solid work. @barbarahui and I sat down today to take a good look at things and have these questions:
Overall, I'm wondering if we can find some way to concretize this. Flattening the mapper class inheritance model would at least reduce one axis in that matrix and is definitely the most appealing to me. We already know that there is no namespace collision in our leaf mappers. Incidentally, I'm honestly wondering if it doesn't make sense to flatten the mapper inheritance tree as well (though that will likely come after launch sometime). Each time we make updates to base mappers (such as the base oai mapper), we have to re-run validation reports to every single collection that uses that base mapper because they're all implicated in the change. As this code base evolves, it will have to be a best practice to only update leaf mappers in order to reduce implicating all the other data in the system, unless we're ready to implicate (and validate) everything. I know you've done a bit more analysis of the OAI mappers specifically since you last worked on this, so I'd be curious to hear your thoughts regarding the requirements for our fixture generators grounded in some analysis. Probably worth taking a look at (or talking to Lucas about) another tree of mappers as well. All in all, this use of faker to populate schema definitions of the incoming vernacular metadata seems like a great approach. |
706e98b to
f82fd09
Compare
|
@amywieliczka @barbarahui I've implemented a few improvements and, currently, a functional ContentDM mapper test.
pytest metadata_mapper/test --mapper=contentdm
pytest metadata_mapper/test --mapper=contentdm_mapper # same as above - `_mapper` is optional
pytest metadata_mapper/test --mappers=contentdm,cca_vault # `mappers` arg can take mutiple comma-delimited mapper names
def setup_mocks(self):
url = "https://example.com"
response = { "success": True }
self.request_mock.register_uri("GET", url, json=response)See
Next up, I'm working on the |
* Todo: more test helpers needed * Currently fakes record.legacy_couch_db_id
* Exceptions in mapper init are now reported * pytest failures due to exceptions include backtrace * Fixes pytest short circuiting on mapper init
9c67522 to
f622e03
Compare
5664e9b to
d833812
Compare
b68eff9 to
3bd6b7c
Compare
Implements infrastructure for mapper unit smoke tests.
test_mapperiterates over all implemented mappers, builds data fixtures for them, and runsto_UCLDC(), expecting no exceptionstest_{mapper_name}_mapperintest_mapper.py. The generic mapper test will not run for this mapper.