TurkuNLP (@turkunlp)

DGfS 2026 | Universität Trier

Researchers from TurkuNLP attended the workshop “Linguistic patterns of textual organization across register” at the DGfS conference in Trier, Germany (Feb 25–27). "One of the best conferences I've been to!" said one attendee. @unitrier.bsky.social 😎👍
www.dgfs2026.uni-trier.de #UniTrier #UTU

03.03.2026 09:13 👍 0 🔁 0 💬 0 📌 0

Our doctoral researcher Amanda Myntti studies how certain linguistic properties of texts affect the features of language models.

According to her, it is essential to pay attention to what kind of data a language model has been trained on.

👉 Read Myntti's commentary: www.utu.fi/fi/ajankohta...

28.01.2026 12:59 👍 10 🔁 3 💬 0 📌 0

Machine learning models learn to recognize different text genres based on their linguistic features. This is the finding of Liina Repo, MA, whose doctoral dissertation will be publicly examined on Fri 30 Jan.

Warm congratulations to the doctoral candidate! 🎩
#väitös #tutkimus #tiede #tekoäly #koneoppiminen #AI
tinyurl.com/2kn83usj

27.01.2026 13:04 👍 13 🔁 1 💬 0 📌 0

We are glad to have two visiting scholars, Karim Hemina and Florian Frenken, here to share their expertise! Today we got to listen to their presentations on the topics "Fake news detection on social networks" and "Dynamic text structure across online registers - A geometric multivariate approach".

13.01.2026 13:36 👍 9 🔁 2 💬 0 📌 0

“This compute allocation will allow the project to continue and expand its efforts to build the next generation of fully open LLMs for all European languages,” says TurkuNLP member Sampo Pyysalo, technical lead in OpenEuroLLM.

13.01.2026 07:41 👍 1 🔁 0 💬 0 📌 0

AI Gala 2025 finalists announced: here are the top names and use cases in Finland's AI field - AI Finland The AI Gala on 17 Nov brings together 500 business leaders, experts, and innovators to celebrate significant achievements in AI and its future opportunities in Finland. The international AI S...

TurkuNLP member Risto Luukkonen's MSc thesis has been selected as one of three contenders for the best AI thesis of the year by AI Finland! 🎉 The winner will be announced at the AI Gala next week. aifinland.fi/ai-gaala-202...

12.11.2025 12:23 👍 1 🔁 0 💬 0 📌 0

FAIR Science Café: FAIR Science Café is an informal and interactive online event where researchers get to talk about their work, research, and results in their own words. The discussion also highlights the data used, …

FAIR Science Café @csc.fi is an interactive online event where researchers present their work, highlighting the data used or produced. On Nov 21, Tomasz Galica @turkunlp.bsky.social talks about developing and evaluating LLMs, training datasets, and risks. Info & sign up: www.dariah.fi/event/fair-s...

12.11.2025 09:35 👍 2 🔁 1 💬 0 📌 0

Our experts contributed to the latest #HPLT dataset publication, which contains some very interesting results! See here: t.co/uN2zoSF251 #DataScience

06.11.2025 14:47 👍 3 🔁 2 💬 0 📌 0

New Centre of Excellence led by Virpi Lummaa at the University of Turku: The Research Council of Finland (Suomen Akatemia) has selected a unit led by Virpi Lummaa, with Veronika Laippala, Päivi Onkamo, and Outi Vesakoski also involved, as a new Centre of Excellence.

The Research Council of Finland (Suomen Akatemia) has selected a unit led by Virpi Lummaa, which studies human diversity, as a new Centre of Excellence (2026-2033). Veronika Laippala, Päivi Onkamo, and Outi Vesakoski are also involved. Centres of Excellence in research are at the international forefront of their fields. www.utu.fi/fi/ajankohta...

31.10.2025 08:20 👍 4 🔁 0 💬 0 📌 0

Doctoral students from TurkuNLP, together with people from DigiTS Tartu, are planning a workshop on presentation skills specifically for DH researchers! We are grateful to #TurkuUniversityFoundation for the Villa Tammekann grant for hosting the upcoming workshop next autumn. ♥️ Looking forward to it!

27.10.2025 12:26 👍 1 🔁 0 💬 0 📌 0

TCBLex - A lexical database of Finnish literary texts for children - Behavior Research Methods This work introduces TCBLex, a lexical database of Finnish literary works read by children between the ages of 7 and 15. We explain in detail the work done to build the corpus TCBLex is based on, incl...

Nojonen, Korsu, Ginter, Laippala & Kanerva (2025) introduce TCBLex, a lexical database of Finnish literary works read by children aged 7-15. The data consist of 14 sub-lexicons and over 11 million tokens, annotated and lemmatized.
Paper: link.springer.com/article/10.3...
Data: doi.org/10.5281/zeno...

20.10.2025 08:48 👍 2 🔁 1 💬 0 📌 0

Two articles by TurkuNLP members have been published in a book about the linguistic landscape of Turku, except that Kupari & Lamberg (2025) and Ristilä (2025) have turned the tables and observed the "landscape in language". The book is available for free online here: oa.finlit.fi/books/e/10.2...

13.10.2025 07:09 👍 2 🔁 0 💬 1 📌 0

Our Doctoral Researcher Otto Tarkka (@ottotarkka.bsky.social) visited CSC facilities in Kajaani last month on a trip organized by FIN-CLARIAH. "It was great to meet new people and hear how CSC computers are used in a wide variety of research projects."

13.10.2025 06:52 👍 3 🔁 1 💬 0 📌 0

Our Latin expert, Hanna-Mari Kupari, presented at the Norwegian Institute in Rome on "Latin Across Registers: A Computational Analysis of Situational Language Use Reflected in Grammar". See the slides and abstract here:
github.com/HannaKoo/Nor...

29.09.2025 10:32 👍 2 🔁 0 💬 0 📌 0

Teimouri, Kanerva & Ginter (2025) published insights into model interpretability in their study of a multi-attention head model, showing that heads capture distinct semantics and that deeper layers enhance separation, but pooling can blur patterns: acl-bg.org/proceedings/...

29.09.2025 07:22 👍 2 🔁 0 💬 0 📌 0

Maryam from TurkuNLP participated in #RANLP2025 (Recent Advances in Natural Language Processing), and their team won a competition to build a hate speech classifier for five low-resource languages. 🏆 Congrats!

15.09.2025 12:31 👍 2 🔁 0 💬 0 📌 0

Why does Europe need its own language models in the age of AI, researcher Sampo Pyysalo?

27.04.2025 09:41 👍 4 🔁 2 💬 1 📌 0

Close-up of Tapio Salakoski and Filip Ginter. The text over the image reads: Tiedelinja: Is AI a great opportunity or a fatal mistake?

What are our researchers' greatest hopes and worst fears about AI?

🎧 Listen to the latest episode of our Tiedelinja podcast, in which Professor of Data Analytics Filip Ginter and Vice Rector Tapio Salakoski discuss AI.

👉 Listen to the Tiedelinja podcast: www.utu.fi/fi/ajankohta...

24.01.2025 08:43 👍 25 🔁 6 💬 0 📌 0

TurkuNLP was at the Corpus Linguistics Conference 2025! #CL2025 Pictures of some of our participants by Hanna-Mari Kupari and Jiaqi Guo. Search the book of abstracts for "University of Turku" to read more about our contributions: drive.google.com/file/d/1TiwO... Thank you @cl2025.co.uk!

08.07.2025 09:19 👍 4 🔁 2 💬 0 📌 0

OpenEuroLLM A series of foundation models for transparent AI in Europe

TurkuNLP leads the central work package on building LLMs within OpenEuroLLM.
openeurollm.eu/blog/LUMI-Ex...

30.05.2025 06:59 👍 2 🔁 2 💬 0 📌 0

Our recent paper on the impact of register (genre) on LLM performance. Key points: news texts do poorly in evaluation, while opinionated texts are among the best. We hope this work can be used to understand the impact of register on LLMs and improve training data mixes! arxiv.org/abs/2504.01542

15.04.2025 12:57 👍 5 🔁 1 💬 0 📌 1

TurkuNLP is now on Bluesky! 🎉

15.04.2025 11:31 👍 8 🔁 1 💬 0 📌 0
