Superlinguo
For those who like and use language
Linguistics Data Interest Group - New RDA group to improve data citation and transparency
A really big part of linguistic research is having data. Whether it’s a language documentation corpus, a set of experiments, or your own intuitions about how language works, this is all data on which analyses and theories are built.
While linguists have always relied on data, we’re not the best at making clear where the data we are talking about come from. There has been little incentive or support for doing that. But it is important. Making it clear where your data come from, and giving other people access to them, helps make linguistic research more reproducible.
I think that it’s important that we encourage linguists to cite where their data come from, and make that data more easily accessible to other researchers. That is why I’ve joined an awesome group of researchers to create the Linguistics Data Interest Group (LDIG) as part of the Research Data Alliance. I’m a co-chair of the group (!!!) along with Andrea L. Berez-Kroeker (U Hawai‘i), Susan S. Kung (U Texas) and Helene N. Andreassen (UiT The Arctic University). If you’re a linguist who works with data (i.e. any linguist) and you think that it’s important that we strive to do the best kind of science we can, then you’re welcome to join the LDIG through joining the RDA (it’s free to join, and puts you in a really great network).
I’m especially hoping that some people who are doing their PhD or PostDoc will join, because we’re often at the front line of data collection and management.
From the LDIG announcement:
The Linguistics Data Interest Group (LDIG) has been established through the Research Data Alliance (RDA) and aims to develop the discipline-wide adoption of common standards for data citation and attribution. In our parlance citation refers to the practice of identifying the source of linguistic data, and attribution refers to mechanisms for assessing the intellectual and academic value of data citations. The LDIG aims to encourage an international discussion of these topics, bolstering discussions that are already happening in specific sub-disciplines of linguistics in different countries.
The LDIG is for people who work with linguistic and language data. This work includes, but is not limited to, the collection, management and analysis of linguistic data. We encourage participation from academic and speaker communities.
You can see the LDIG draft Charter Statement on the RDA website (and leave a comment if you sign up as an RDA member).
Ethnologue launches subscription service
I’ve waited a while to post about this, but I think it’s part of a larger discussion that linguists are having about publishing and access right now, and it’s a discussion that is worth having.
Ethnologue provides a catalogue of the world’s languages. It includes population size, language vitality, location, related languages and other information about each language. SIL International, the publishers of Ethnologue, are also responsible for managing the ISO 639-3 codes, which are assigned to all languages to help with identification. SIL is a Christian faith-based organisation, with a focus on language documentation and description. They do not actively engage in mission work or Bible translation. Hedvig at Humans Who Read Grammars has an excellent summary of Ethnologue’s history, function within SIL and other information about this paywall event. As she notes, even though SIL do not do Bible work, their main funder Wycliffe do. This makes many non-SIL linguists uneasy about the role of SIL in language documentation.
In December SIL announced a new paywall policy in the blog post I’ve linked to at the top. Users from high income countries as defined by the World Bank will be able to view a maximum of 7 country or language pages free, before having to pay a $10/month or $60/year fee. Or their institution can pay for a subscription.
SIL concede that this paywall only affects 5% of users. Knowing how ineffective paywalling has been for commercial media over the last decade, it’s an interesting decision. Yes, all of your most basic paywall-avoiding techniques will work. SIL must therefore think that people will pay because they believe this is a useful service.
Do I really think Ethnologue is a useful service? Ethnologue has been a major undertaking, and has been an important development in mapping the world’s linguistic diversity, but the more time I spend working in language documentation and linguistic diversity the more I see its limitations. As Harald Hammarström notes in this review, the fact that not a single datum is supported with a citation or an acknowledgement of contributors is a major flaw. It is one that Ethnologue could have been slowly rectifying over the last few editions, but instead it now leaves the user wondering how reliable or recent the data is. There are other sites that can also offer the casual user the information that they are looking for:
Glottolog
Mostly an aggregation of references on each language, as well as location information and alternative names. Glottolog also has its own language code system.
UNESCO Atlas of the World’s Languages
Not as much information as other sites, and not all endangered languages are equally covered, but UNESCO are moving towards a new version in the next couple of years which will hopefully bring more information online.
Endangered Languages Project
Managed through Google, this site gives references for all information. Users with Google accounts can update and add links to other materials.
To be honest, between these three sites, Ethnologue and the myriad of other smaller resources, sometimes following the source of information about a language feels like a snake eating its own tail. Each site has its own function and use, but they all ultimately refer to the same small set of resources with more or less transparency and ease of update.
This paywall decision from Ethnologue came at around the same time that the entire editorial board of Lingua left the journal, and its for-profit publisher Elsevier, to start Glossa, their own open access journal. The Lingua editors are based mostly in academia, rather than SIL’s charity model, but it’s interesting that they, like many, are thinking critically about pay-for-access models. We’ll be keeping an eye on both the Ethnologue and Glossa developments this year.