Rachel Wicks (@rewicks)

For anybody in the Mid-Atlantic region, the annual conference MASC is looking for a host next year. It's a great chance for your university to meet other researchers (and potential collaborators) in our region!

16.12.2024 21:53 👍 0 🔁 0 💬 0 📌 0

Could you give an example of the input/output you're looking for, and for which function call (encode, tokenize, etc.)? And maybe which tokenizer it's inheriting from 😅 (looks like maybe the OPT models inherit from a GPT2Tokenizer?)

26.11.2024 19:32 👍 1 🔁 0 💬 1 📌 0

A compilation of adorable dog photos referencing a Simpsons meme ("Do it for her")

Happy to talk about any of these topics and more!

I will also likely end up talking a lot about my pride and joy (my dog).

20.11.2024 00:03 👍 1 🔁 0 💬 0 📌 0

GitHub - rewicks/ctxpro: Data and annotation toolkit for finding translation ambiguities in bitext

And if you think sentence-level machine translation is good enough, I encourage you to run your systems on our evaluation data (ctxpro, an extension to ContraPro and other similar evaluation datasets).

github.com/rewicks/ctxpro

20.11.2024 00:03 👍 1 🔁 0 💬 1 📌 0

jhu-clsp/paradocs · Datasets at Hugging Face

Most recently, I released the ParaDocs dataset, which reconstructs document annotations on large, parallel machine translation datasets. Contextual information is integral to machine translation, but often overlooked!

Data: huggingface.co/datasets/jhu...

20.11.2024 00:03 👍 0 🔁 0 💬 1 📌 0

Since we're all new here, an introduction:

I'm a final-year PhD student at Johns Hopkins University (in @jhuclsp.bsky.social), working with Philipp Koehn and Matt Post.

I'm largely interested in the creation and processing of high-quality, multilingual datasets for both training and evaluation.

20.11.2024 00:03 👍 19 🔁 2 💬 2 📌 0

CLSP Join the conversation

Putting together a JHU Center for Language and Speech Processing starter pack!

Please reply or DM me if you're doing research at CLSP and would like to be added - I'm still trying to find out which of us are on here so far.

go.bsky.app/JtWKca2

19.11.2024 15:37 👍 22 🔁 10 💬 2 📌 1

Cool work by @jhuclsp colleagues Rafael Rivera Soto and Nick Andrews on how AI-generated text carries unique stylistic fingerprints, enabling the detection and identification of specific language models.

Based on ICLR paper: arxiv.org/pdf/2401.06712
hub.jhu.edu/2024/11/18/a...

19.11.2024 18:17 👍 15 🔁 4 💬 0 📌 0
