#lingdata


After five years with the Linguistics Data Interest Group of the Research Data Alliance, I’ve stepped down as a co-chair of the group. I wanted to use this as a chance to collate some of the work I was involved with over that time. I’ll still be a member of the LDIG for its next chapter, and I’m thrilled that Andrea Berez-Kroeker, Helene N. Andreassen, and Lindsay Ferrara will be heading things up. And, of course, the group has many excellent members (if you’re a linguist and/or interested in data management, you can join the LDIG too!).

The Austin Principles of Data Citation in Linguistics

The Austin Principles of Data Citation were the first major output of the LDIG, and they focus on the why of data citation. This short document is a position statement on the importance of data in linguistic work. From the preamble:

Data is central to empirical linguistic research. Linguistic data comes in many different forms, and is collected and processed with a wide range of methods. Data citation recognizes the centrality of data to research. Furthermore, it facilitates verification of claims and repurposing of data for other studies.

The official Austin Principles website

The Superlinguo post

The Tromsø Recommendations in academic publishing

If the Austin Principles are the why, the Tromsø Recommendations are the how.

The Tromsø Recommendations provide clear guidance for citing and referencing language data, both in the bibliography and in the text of linguistics publications. The recommendations have been written to account for the rich variety of linguistic data, and include worked examples.

The official Tromsø Recommendations documents

The Superlinguo post

Building uptake for the Tromsø Recommendations

Now that the Tromsø Recommendations have been published, there’s an ongoing campaign to normalise their use in academic publishing and grant writing. Get involved by encouraging your favourite publishers to include the Tromsø Recommendations in their author guidelines!

The Linguistics Data Interest Group (a working group in the Research Data Alliance) have developed the Tromsø Recommendations in collaboration with linguists working in a range of disciplines. The next step is to help encourage citation of data by encouraging journals to include the Tromsø Recommendations in their instructions for authors.

bit.ly/trecs-campaign

Publications

Alongside LDIG colleagues, I was involved in a number of publications looking at the role of data in linguistics. Below are links to the Superlinguo posts with more information, along with abstracts and links to open access versions.

See also: The Superlinguo lingdata tag.

Suzy Styles is one of my favourite people to talk to about research data, the importance of transparency in research methods, and how we can always do things better. So when the editors of the Open Handbook of Linguistic Data Management invited us to submit a chapter about the things linguists can learn about data management from discussions in other fields of social science (particularly experimental psychology), I was so excited to sit down with Suzy and bring together everything I’ve learnt in the last few years of our discussions.

The Open Handbook of Linguistic Data Management has 56 chapters, all available as open access PDFs for you to download. Print copies are also available on a print-on-demand model. Chapters cover a range of specific case studies, approaches, and languages. Putting this volume together has been a major effort, and we’ve been so grateful for all the work done by Andrea Berez-Kroeker, Brad McDonnell, Eve Koller and Lauren Collister.

Alongside the handbook is a free and open online companion course that covers the first 13 chapters (including ours). The course component for our chapter includes a summary, keyword definitions, links to other chapters, activities you can do for your own work, and a revision quiz! The course page also has a colour version of the Coin Flipping Cowboys illustration that is printed in black and white in the chapter itself. (You’ll have to read the chapter to learn how these cowboys can help you think about your data!)

Opening paragraph

Linguists spend a lot of time working with data, but we do not always give much thought to the role that data plays in building the larger research culture in our field. We can learn a lot about good data management in our own discipline by learning from what is happening in related fields, both in terms of innovations and new benchmarks, as well as when things have not gone right. We look at how data have been conceptualized and managed in other areas of the social sciences, particularly social psychology, and how current attitudes are shaping the future of research. The fundamental theme of this discourse is the centrality of openness, both in terms of transparency of methodology and making primary data more accessible to people beyond the original researchers. This move toward open research aims to reduce biases, both for individual researchers and for the discipline, and encourages more considered data collection and presentation.

Reference

Gawne, L. & S. Styles. 2022. Situating linguistics in the social science data movement. In A.L. Berez-Kroeker, B. McDonnell, E. Koller & L.B. Collister (Eds), The Open Handbook of Linguistic Data Management, 9-25. MIT Press. doi: 10.7551/mitpress/12200.003.0006 [Open Access PDF]

Other resources

See also:
