UpskilLing
Language data scientist
As a language data scientist, you normally conduct research on language data and/or on business processes and market needs. Your work mostly involves problem solving: asking questions, obtaining the relevant data, exploring the data, modelling it statistically and communicating the results.
In the following sections, you will find more details about what you need to know to perform the tasks presented above effectively. Please bear in mind that maintaining a strong foundation of disciplinary, (inter)cultural and transversal skills is equally indispensable for success across all listed tasks and responsibilities.
This means that you are able to use linguistic data for research, frame questions through appropriate research design, think analytically and process information critically.
This means that you have some experience in collecting linguistic data (e.g. from corpora or from speakers, in an interview or an experiment), in curating and storing them according to a set of standards, and in analysing them (especially quantitatively).
This means that you know about different types of language technologies, such as CAT tools, machine translation engines and ChatGPT, and that you not only know how to use them but also have some knowledge of how they work; you also have some programming skills, e.g. in Python.
This means that you have some familiarity with entrepreneurship and that you are able to manage different kinds of projects, plan their implementation, implement quality assurance procedures and ensure that teams work well together towards a common goal.
Language data analyst
Language data collection, annotation, analysis
Language data manager
Language data cleaning, curation, management
Language project manager
Language project and workflow coordination