Adding dataset #1499
Thanks for adding this list, @RebeccaLethgo. @FredericBlum and @MiraAhmedovic, could you review the PR?
FredericBlum
left a comment
Just two small comments; looks good otherwise. The list is likely nearly identical to https://concepticon.clld.org/contributions/Comrie-1977-207, but uses slightly different glosses.
concepticondata/conceptlists.tsv
Outdated
| Soederholm-2013-420 Söderholm, Carina and Häyry, Emilia and Laine, Matti and Karrasch, Mira 2013 420 ratings English Finnish https://doi.org/10.1371/journal.pone.0072859 Soederholm2013 This list of 420 Finnish Nouns was rated for valence and arousal by 996 native Finnish speakers, aged 16 to 77. The results were presented in sum as well as split up by age and gender. For this purpose, four groups were created in the original study, which are encoded here as GROUP_01 (16-19 years old), GROUP_02 (20-30 years old), GROUP_03 (31-59 years old) and GROUP_04 (60-77 years old). The original dataset further includes surface and lemma frequencies, as well as bigram, initgram and fintrigram frequencies for each word, which were omitted here. e72859 | ||
| Eilola-2010-210 Eilola, Tiina M. and Havelka, Jelena 2010 210 ratings English, Finnish English, Finnish https://doi.org/10.3758/BRM.42.1.134 Eilola2010 This list of 210 words contains ratings of familiarity, valence, emotionality, offensiveness, and concreteness for Finnish and British English nouns, including 34 taboo words. Ratings were provided by native speakers of each language. For British English in particular, the aim of the study was to collect data comparable to the American English norms in the Affective Norms for English Words database [(Bradley & Lang, 1999)](:bib:Bradley1999). The present mappings were based on the English words. 134-140 | ||
| Eilola-2010-210 Eilola, Tiina M. and Havelka, Jelena 2010 210 ratings English, Finnish English, Finnish https://doi.org/10.3758/BRM.42.1.134 Eilola2010 This list of 210 words contains ratings of familiarity, valence, emotionality, offensiveness, and concreteness for Finnish and British English nouns, including 34 taboo words. Ratings were provided by native speakers of each language. For British English in particular, the aim of the study was to collect data comparable to the American English norms in the Affective Norms for English Words database [(Bradley & Lang, 1999)](:bib:Bradley1999). The present mappings were based on the English words. 134-140 | ||
| Pache-2023-207 Pache, Matthias 2023 207 basic English Chibchan languages (Pech, Atanques, Damana, Ika, Kogi, Barí, Chimila, Kuna, Muisca, Tunebo , Bocotá, Guaymí, Dorasque (Dorace, Chumulu, Gualaca), Chánguena , Bribri, Cabécar, Térraba, Boruca, Guatuso, Rama) https://www.journals.uchicago.edu/doi/10.1086/722240 Pache2023 This list was used for a comparative analysis of the Chibchan languages with the aim of revising their internal genealogical classification. The author claims that the list represents the Swadesh 207 list, however, it is unclear, which list is exactly meant since Swadesh never published a 207 list. The data for the Chibchan languages was gathered from existing sources on various Chibchan languages. 81-103 |
The target language entry "Chibchan languages (...)" should probably just be "Chibchan", without listing the individual language names. Perhaps the URL should be the DOI URL (doi.org/) instead of the journal URL?
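To illustrate the URL suggestion, here is a minimal Python sketch that derives a doi.org link from a publisher URL that embeds the DOI. The helper name `to_doi_url` and the regular expression are assumptions made for this example; they are not part of the Concepticon tooling, and the pattern only covers DOIs of the common `10.<registrant>/<suffix>` shape.

```python
import re

# Assumed pattern for illustration: "10.", a 4-9 digit registrant code,
# a slash, then the DOI suffix up to the next whitespace.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def to_doi_url(url):
    """Return a https://doi.org/ link for the DOI embedded in `url`.

    Returns None when no DOI-like substring is found.
    """
    match = DOI_PATTERN.search(url)
    if match is None:
        return None
    return "https://doi.org/" + match.group(0)


# The journal URL from the conceptlists.tsv row under review.
journal_url = "https://www.journals.uchicago.edu/doi/10.1086/722240"
print(to_doi_url(journal_url))
```

A URL that already points to doi.org passes through unchanged, so the helper could be applied to the whole URL column without special-casing.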
| Pache-2023-207-73 73 louse 1392 LOUSE | ||
| Pache-2023-207-74 74 man 1554 MAN | ||
| Pache-2023-207-75 75 many 1198 MANY | ||
| Pache-2023-207-76 76 meat 2615 FLESH OR MEAT |
Note for @AnnikaTjuka: The list is probably nearly identical to Comrie-1977-207 (https://concepticon.clld.org/contributions/Comrie-1977-207), but with slightly changed glosses. We still decided to use the same mappings, since the author indicates that this is the list he used.
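A similarity claim like this could be checked with a small script. The following is only a sketch: it assumes both concept lists are tab-separated with `ENGLISH` and `CONCEPTICON_ID` columns, and the function names are invented for this example. The two Pache rows are taken from the diff above; the rows of the second list are made up for illustration and are not actual Comrie-1977-207 data.

```python
import csv
import io


def read_mappings(tsv_text):
    """Map each CONCEPTICON_ID to the English gloss used in one list."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["CONCEPTICON_ID"]: row["ENGLISH"]
        for row in reader
        if row["CONCEPTICON_ID"]  # skip unmapped concepts
    }


def compare_lists(a, b):
    """Report concept sets unique to each list and glosses that differ."""
    shared = set(a) & set(b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "changed_glosses": sorted(
            (cid, a[cid], b[cid]) for cid in shared if a[cid] != b[cid]
        ),
    }


def make_tsv(rows):
    """Build TSV text with the assumed column layout."""
    header = ["ID", "NUMBER", "ENGLISH", "CONCEPTICON_ID", "CONCEPTICON_GLOSS"]
    return "\n".join("\t".join(r) for r in [header] + rows) + "\n"


# Two rows from the Pache-2023-207 diff in this PR.
pache_sample = make_tsv([
    ["Pache-2023-207-76", "76", "meat", "2615", "FLESH OR MEAT"],
    ["Pache-2023-207-92", "92", "rain", "658", "RAIN (PRECIPITATION)"],
])

# Hypothetical rows standing in for the list being compared against.
other_sample = make_tsv([
    ["Other-1", "1", "flesh", "2615", "FLESH OR MEAT"],
    ["Other-2", "2", "rain", "658", "RAIN (PRECIPITATION)"],
    ["Other-3", "3", "louse", "1392", "LOUSE"],
])

report = compare_lists(read_mappings(pache_sample), read_mappings(other_sample))
print(report)
```

Comparing by Concepticon ID rather than by gloss is a deliberate choice here, since the glosses are exactly what is expected to differ between the two lists.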
| Pache-2023-207-89 89 one 1493 ONE | ||
| Pache-2023-207-90 90 other 197 OTHER | ||
| Pache-2023-207-91 91 person 683 PERSON | ||
| Pache-2023-207-92 92 rain 658 RAIN (PRECIPITATION) |
Should perhaps be RAINING OR RAIN
Thanks for the suggestion! In this case, I believe that the original mapping fits better because in Pache's original list, "rain" is listed in the context of nouns and not verbs (see Pache's appendix).
Thank you for adding this list; just minor things.
concepticondata/conceptlists.tsv
Outdated
| Soederholm-2013-420 Söderholm, Carina and Häyry, Emilia and Laine, Matti and Karrasch, Mira 2013 420 ratings English Finnish https://doi.org/10.1371/journal.pone.0072859 Soederholm2013 This list of 420 Finnish Nouns was rated for valence and arousal by 996 native Finnish speakers, aged 16 to 77. The results were presented in sum as well as split up by age and gender. For this purpose, four groups were created in the original study, which are encoded here as GROUP_01 (16-19 years old), GROUP_02 (20-30 years old), GROUP_03 (31-59 years old) and GROUP_04 (60-77 years old). The original dataset further includes surface and lemma frequencies, as well as bigram, initgram and fintrigram frequencies for each word, which were omitted here. e72859 | ||
| Eilola-2010-210 Eilola, Tiina M. and Havelka, Jelena 2010 210 ratings English, Finnish English, Finnish https://doi.org/10.3758/BRM.42.1.134 Eilola2010 This list of 210 words contains ratings of familiarity, valence, emotionality, offensiveness, and concreteness for Finnish and British English nouns, including 34 taboo words. Ratings were provided by native speakers of each language. For British English in particular, the aim of the study was to collect data comparable to the American English norms in the Affective Norms for English Words database [(Bradley & Lang, 1999)](:bib:Bradley1999). The present mappings were based on the English words. 134-140 | ||
| Eilola-2010-210 Eilola, Tiina M. and Havelka, Jelena 2010 210 ratings English, Finnish English, Finnish https://doi.org/10.3758/BRM.42.1.134 Eilola2010 This list of 210 words contains ratings of familiarity, valence, emotionality, offensiveness, and concreteness for Finnish and British English nouns, including 34 taboo words. Ratings were provided by native speakers of each language. For British English in particular, the aim of the study was to collect data comparable to the American English norms in the Affective Norms for English Words database [(Bradley & Lang, 1999)](:bib:Bradley1999). The present mappings were based on the English words. 134-140 | ||
| Pache-2023-207 Pache, Matthias 2023 207 basic English Chibchan languages (Pech, Atanques, Damana, Ika, Kogi, Barí, Chimila, Kuna, Muisca, Tunebo , Bocotá, Guaymí, Dorasque (Dorace, Chumulu, Gualaca), Chánguena , Bribri, Cabécar, Térraba, Boruca, Guatuso, Rama) https://www.journals.uchicago.edu/doi/10.1086/722240 Pache2023 This list was used for a comparative analysis of the Chibchan languages with the aim of revising their internal genealogical classification. The author claims that the list represents the Swadesh 207 list, however, it is unclear, which list is exactly meant since Swadesh never published a 207 list. The data for the Chibchan languages was gathered from existing sources on various Chibchan languages. 81-103 |
Suggested rephrasing: "The author claims that the list represents the Swadesh 207 list; however, it is unclear which list is meant exactly, since Swadesh never published a list containing 207 words." Also, the similarity to Comrie-1977-207 could be noted here.
Thanks, @FredericBlum and @MiraAhmedovic, for providing the reviews. @RebeccaLethgo, could you let me know when you have made the changes? Then I'll do a final check.
@AnnikaTjuka Thanks for moderating. I've wrapped up all the changes now.
AnnikaTjuka
left a comment
Feel free to merge! Just a small note for upcoming PRs: We usually use the name of the dataset as the PR title, so instead of "Adding dataset" it would be "Adding Pache-2023-207" or simply "Pache-2023-207". This allows us to identify the PR more quickly later on.
Pull request checklist
concepticon notlinked --gloss "NEW_GLOSS"
Additional information
This pull request adds the concept list from Pache 2023, which was used for a comparative analysis of the Chibchan languages. @AnnikaTjuka has offered to moderate and @FredericBlum has offered to review.