I built a GitHub issue classifier for Apache Arrow issue language using {ellmer} - super simple and almost 100% accuracy. Blog post: niccrane.com/posts/llm-issue-triage/
#rstats #ai #llms
Or roughly 0.08 Friedman Units
en.wikipedia.org/wiki/Friedma...
“not one of the recommendations was a new idea to NCES,” said Peggy Carr. “Many had already been implemented or we were working on when the center was dismantled.”
Great reporting by @jillbarshay.bsky.social hechingerreport.org/proof-points...
We released DuckDB v1.5!
This release comes with a “friendly CLI” client, a new (opt-in) PEG parser, support for VARIANT types and many lakehouse features. It also ships a new network stack, a reworked geospatial extension, Azure writes and an ODBC scanner.
Read more at duckdb.org/2026/03/09/a...
The sky is not falling; high-quality platforms (Prolific, Verasight, CR Connect) have low rates of apparent bots. osf.io/preprints/ps... But also not zero; vigilance is very much needed!
R-Ladies branded graphic with purple-to-blue gradient background. The heading reads "Our Programs" in white. Six program cards are arranged in a 2-by-3 grid: Mentoring (connecting new chapter organizers with experienced leaders), RoCur (rotating curation on Bluesky spotlighting community voices), Abstract Review (80+ reviewers helping members submit to conferences), Community Slack (a safe space for connecting, learning, and sharing), Blog (tutorials, stories, and career journeys from our community), and YouTube (event recordings and learning materials).
R-Ladies is more than meetups. Our programs:
Mentoring: pairing new organizers with experienced ones
RoCur: rotating curation on Bluesky
Abstract Review: 80+ reviewers for conference submissions
Community Slack
Blog & YouTube
rladies.org
#RLadiesIWD2026
Happy that I just released my first #quarto extension: quarto-envelope
I had 100+ birth announcement letters to write, so I built a Quarto/Typst extension to generate print-ready PDFs programmatically from R.
I hope it will be helpful for the #rstats community!
github.com/Felixmil/qua...
Live and lapply()
The sapply() who loved me
A Quarto of Solace
i love data, me too meme
Just learned about the delightful R package “fcuk” to help users correct typos while coding:
cran.r-project.org/web/packages...
Usually I can find something to appreciate and treat it as a learning experience. Like when I first had to use Python I enjoyed learning about comprehensions and itertools. It helps counterbalance the ick from things like Pandas or overstuffed Jupyter notebooks.
It’s usually easy but sometimes it gets stressful to make the short turnaround time to address CRAN check warnings/notes or else have your package archived.
With very large numbers of n’s you don’t need randomization, and with LLMs we can generate very large numbers of n’s, so I think all of science is solved by now. I don’t see any problems with this.
If only AI / ML had been around when I was training, I wouldn’t have had to learn about things like causal inference, how to evaluate prediction models or even, say, the importance of data quality. What a waste of time all that was!
Screenshot of both sides of the printable version of the cheatsheet
Screenshot of the web version of the recipes cheatsheet
#tidymodels now has its very first cheatsheet! "Preprocessing data with {recipes}" is now available in Web and PDF versions here: rstudio.github.io/cheatsheets/... #rstats #posit #rstudio
I just learned that Ayatollah Khamenei and Ayatollah Khomenei are not the same person. Here's my plan for regime change in Iran....(1/23)
There's a moment in every data engineer's career when they discover they can query a 10GB Parquet file on their laptop in seconds.
That's the DuckDB moment.
It changes how you think about what requires a cluster and what doesn't. Spoiler: most things don't.
ssp.sh/blog/enterp...
……. Deep cut
THERE IS ONLY ONE TRUE WAY TO CODE AND IT IS TIDY. All others will perish on the altar of messiness. MUAHAHAHAHAAAAAAAAAAAAAA
The more I learn about #rstats the more excited I get. We have a rich ecosystem of tools / libraries such as #shiny or @quarto.org that I honestly feel like I can do anything
There's tremendous opportunity in corporations to improve and transform their workflow and reporting capabilities.
that’s a big selling point for weighted bootstraps (and things like Fay’s method), so that you don’t get a bad bootstrap sample that breaks your model
DC district court has denied the Department of Education's motion to dismiss our case challenging IES's termination of four research studies, its peer review program, and restricted data use application processing!
ecf.dcd.uscourts.gov/cgi-bin/show...
spopt-r brings powerful spatial optimization algorithms for regionalization, facility location, and market analysis to R with a blazing-fast Rust backend.
Use them for analyses in energy, retail, logistics, sales, real estate, and more.
Get started: walker-data.com/spopt-r
Starting a job posting thread in the survey industry. First off, Pew on their methods group
Exciting news! We just posted an opening for a Survey Associate on @pewresearch.org's Methods team! This is an amazing opportunity for someone relatively early in their career to join what is, IMO, the most fun methods team in the business. Full description at the link below.
This piece is open access, and if you write survey questions, you should read it.
The whole dinner scene is amazing. Every time I watch it I’m just howling over the rhetorical questions “snacks?!!” and “did you see our show?!!”