Norm Matloff (你有冇諗清楚呀?)

@matloff

Em. Prof., UC Davis. Various awards, incl. book, teaching, public service. Many books, latest The Art of Machine Learning (uses qeML pkg). Former Editor in Chief, the R Journal. Views mine. heather.cs.ucdavis.edu/matloff.html

633 Followers · 978 Following · 1,600 Posts · Joined 05.11.2024

Latest posts by Norm Matloff (你有冇諗清楚呀?) @matloff

While working full-time & raising a child, Grace Wahba earned advanced degrees, completing her PhD at Stanford and becoming the first female faculty member in statistics at the University of Wisconsin-Madison. #womenshistorymonth #statwomen magazine.amstat.org/blog/2026/03... #statssky

09.03.2026 13:19 👍 22 🔁 11 💬 1 📌 1

Actually, on the Databot page, Joe Cheng has a nice essay explaining his own concerns, largely similar to mine.

Unfortunately, I can't try it out yet. As of now at least, one apparently needs a paid Claude account. I only use free LLMs.

09.03.2026 19:56 👍 0 🔁 0 💬 0 📌 0

Actually I think it is subtly even worse than asking what test to do. And since I believe one should never do tests, that's pretty bad! Passive is passive, no two ways about it.

08.03.2026 20:45 👍 0 🔁 0 💬 0 📌 0

As a computer scientist who still finds computing just as exciting as I did many eons ago, Databot looks really cool! But as a statistician who finds users of statistics understand what they're doing less and less these days, I find this app highly concerning. Data analysis should not be automated.

08.03.2026 20:03 👍 0 🔁 0 💬 1 📌 0

Complex Surveys: A Guide to Analysis Using R by Thomas Lumley
#RStats
https://bigbookofr.com/chapters/social%20science.html#complex-surveys-a-guide-to-analysis-using-r

07.03.2026 12:40 👍 3 🔁 2 💬 1 📌 0

Thanks, interesting point about SQL.

BTW, my original post was in terms of absolute numbers. I would expect dplyr's share to grow relative to data.table and base R even without aspects like SQL. See onlinelibrary.wiley.com/share/author... for why that raises concerns.

07.03.2026 17:51 👍 2 🔁 0 💬 0 📌 0

The data.table package is indispensable for those who need high performance data operations. So it's not going away any time soon; on the contrary, the number of such applications is growing rapidly.

07.03.2026 03:12 👍 5 🔁 0 💬 1 📌 0

It is much easier to understand python code for data science if you know a bit of base R than if you only know tidyverse.

#rstats #python

04.03.2026 19:24 👍 16 🔁 1 💬 4 📌 1

This is a perfect example for anyone who has wanted to get into LLM code assistance but wasn't sure how to get started. It is fully worked out.

Note how Frank completely specified what he wanted ChatGPT to do. The LLM can't guess what you want.

But remember, "Trust but verify." :-)

28.02.2026 18:00 👍 0 🔁 0 💬 0 📌 0

Haven't heard of such a debate, but I don't hear a lot of stuff.

28.02.2026 02:59 👍 1 🔁 0 💬 0 📌 0

That IS neat. Good excuse to become familiar with 'builder'.

27.02.2026 01:19 👍 3 🔁 0 💬 0 📌 0

Keep in mind that even Hadley wrote "It's hard to wrap one's head around" using functions instead of writing loops. So why on Earth are we forcing it on noncoder R learners? Even someone reading here who knows nothing about coding would feel that it's a very troubling question. 2/2

26.02.2026 20:49 👍 0 🔁 0 💬 0 📌 0

The post by David Robinson, a major "R influencer," that I cited earlier in this thread is pretty militant.

I do worry about the overall negative impact that teaching Tidy has on noncoder R learners, which has always been the focus of my criticism; I don't care what "R adults" do :-) . 🧵 1/

26.02.2026 20:43 👍 2 🔁 0 💬 1 📌 0

Heh, heh, if you don't like the word "dogma," don't get me started on the word "cult." :-)

26.02.2026 20:34 👍 1 🔁 0 💬 0 📌 0

I'm not sure the original version of that book made a qualifying statement like this. But in any event the postponement until Chapter 27 sends a huge message, as does the false claim that the book's view represents consensus across the R community.

26.02.2026 20:20 👍 0 🔁 0 💬 0 📌 0

Actually, in pure numbers, the group is rather small, just a few people. But one "person" in that group is a commercial entity with huge influence.

26.02.2026 20:16 👍 0 🔁 0 💬 0 📌 0

My original post made exactly this point, right?

26.02.2026 20:13 👍 0 🔁 0 💬 1 📌 0

You seem to think I am opposed to FP. I use it often. Among other things, I do this because I am lazy (e.g. don't have to set up temporary vectors).

26.02.2026 19:23 👍 0 🔁 0 💬 1 📌 0
Teach the tidyverse to beginners
A few years ago, I wrote a post Don’t teach built-in plotting to beginners (teach ggplot2). I argued that ggplot2 was not an advanced approach meant for experts, but rather a suitable introduction to ...

varianceexplained.org/r/teach-tidy...

26.02.2026 19:20 👍 0 🔁 0 💬 1 📌 0

:-) Thanks for the chuckle, but it illustrates my point about a commercial entity dominating an open source product.

26.02.2026 17:45 👍 1 🔁 0 💬 0 📌 0

A point I make in my new article, onlinelibrary.wiley.com/share/author..., is that while RStudio has done some great things (I am a big Quarto fan), it is always dangerous to have a commercial entity dominate an open-source product. RS took a wrong turn with Tidy, and R suffers for it IMO. 8/8

26.02.2026 17:40 👍 0 🔁 0 💬 0 📌 0

In the next couple of years, through a combination of JJ's keen business acumen and Hadley's charismatic speaking style, Tidy and RStudio really took off. Tidy was pitched as "R for noncoders," and though IMO it's terrible for noncoders, the pitch really worked. 7/

26.02.2026 17:36 👍 1 🔁 0 💬 1 📌 0

On the contrary, we had several discussions re under which circumstances a non-loop's speed would make it worthwhile. But after he joined RS, he began to think hard about how R code "should" be written, resulting in Tidy. 6/

26.02.2026 17:29 👍 0 🔁 0 💬 1 📌 1

Here is the broader picture. Hadley was hired into RStudio as a recent PhD graduate. I was writing my R book at the time, and he asked to be the internal reviewer. He had not yet formulated Tidy, and was not anti-loop. 5/

26.02.2026 17:26 👍 0 🔁 0 💬 1 📌 0

He writes this as though there is a consensus on this, which of course is not true, but no wonder Julia @dingdingpeng.the100.ci took that for granted! 4/

26.02.2026 17:21 👍 0 🔁 0 💬 1 📌 0

"If you know much about iteration in other languages, you might be surprised that we didn’t discuss the for loop. That’s because R’s orientation towards data analysis changes how we iterate..." 3/

26.02.2026 17:17 👍 0 🔁 0 💬 1 📌 0

The picture changed with Hadley. In his R for Data Science book, he does not cover loops until Chapter 27. And in Chapter 26, he writes, 2/

26.02.2026 17:15 👍 0 🔁 0 💬 1 📌 0

What R books (including my own) did in those days was present the *apply functions as OPTIONS, not general advice, for situations in which extra speed was needed. 🧵 1/

26.02.2026 17:12 👍 0 🔁 0 💬 3 📌 0

No offense taken. Not tidyverse but part of the tidyverse dogma. Actually, it would be difficult to define the tidyverse.

26.02.2026 13:28 👍 2 🔁 0 💬 1 📌 0

The word "crusade" is a bit strong, don't you think? Most of my posts don't involve the tidyverse.

But I do strongly believe that teaching the tidyverse is harmful to beginning coders; see matloff.github.io/TidyverseSkeptic. As a teacher, I also object to teaching dogma to innocents, hence my post here.

25.02.2026 22:38 👍 4 🔁 0 💬 2 📌 0