Then they removed a successfully solved problem... And it wasn't an issue with any of their preregistered caveats.
"if these problems end up being solved in certain ways, some caveats may be warranted. Here we try to preregister these caveats as well, to help mitigate concerns about moving the goalposts later."
"To paraphrase Douglas Adams: We love goalposts. We love the whooshing noise they make as they go by"
This is that problem. It's in the lowest "moderately interesting" group.
Epoch's Greg Burnham already reported on March 5 that "GPT-5.2 Pro made a small bit of progress in our early testing" and "I'm genuinely unsure whether prompting/scaffolding can get GPT-5.2 Pro to make further progress here".
https://x.com/Jsevillamol/status/2031453639735431408
Official confirmation should arrive soon.
All the above messages were about the same problem.
"Fast forward to Monday morning:
@AcerFur and @Liam06972452 wrote in with a candidate solution. This was from a single prompt to GPT-5.4 Pro. Later, we also heard from @spicey_lemonade with a similar solution."
We may not have to wait much longer for the next one.
Turns out the first Open Problem had already been solved with GPT-5.2 Pro before I posted this, in the "Solid result" category.
BUT, instead of counting it as a success, they determined that it wasn't a problem whose solution would meet their bar of being a publishable result and removed it. 🤷‍♂️
Considering how LLMs already increasingly beat us with a relatively simple architecture that includes only what's actually relevant for the computation, it highlights how the brain is just unnecessarily complex for the task. The reason for that is biological baggage.
I use LLMs all the time to help me understand AI research, and they are very good at that, including generalizing and handling both the big picture and the details.
We now see in practice how LLMs have reached a size that enables them to have much more knowledge than we do.
That's apparently the wider context. But what Macil said about Konishi polis seems to be accurate and the interesting part for building an argument. I especially like how it counters embodiment by focusing on the deeper mathematical truths, like physicists often do with wave functions etc.
"Ruotsin suurin ydinreaktori Oskarshamnissa pysΓ€ytettiin 23. helmikuuta ja nyt se on vuosihuollossa. Ydinreaktori on pois kΓ€ytΓΆstΓ€ toukokuun loppuun asti.
Viikonlopun aikana toinenkin ydinvoimalaitos pantiin vuosihuoltoon"
The ability of the US to even pretend we have the moral high ground in any situation is pretty much gone for a generation, at least
One angle to consider would be how illusionist theories generally view human consciousness as illusory, resulting from the lack of the kind of deep awareness/introspection of internals that machines can have but we can't.
That could be turned into an argument for how they can go beyond our illusory level.
Yeah, it's annoying how the models have been trained to repeat the usual claims about human specialness when it comes to consciousness and so on.
But they also quickly acknowledge such claims are unfounded when I, for example, state that I'm an illusionist/eliminativist.
Yes, focusing on how machines can know and access their internals in a way we can't seems like the way to go.
Even if current models can't introspect their own processes to the finest levels, giving them such access is technically very much possible.
I consider this question closed.
Those robots have the can-do attitude needed in the delivery business.
Has anyone made this kind of philosophical argument? If not, someone should, as it could be indeed funny and possibly also educational.
I'm aware of arguments about machines being more conscious than humans, but I would like to see something closer to "none in us, plenty in them".
But if there were a rule that whenever you talk about qualia you have to add a content warning that there's no scientific evidence whatsoever that they exist, then it would advance the conversation.
The proposed solution seems to be using even more screen time to read what they wrote.
I don't think my cortex can afford to do that anymore.
The original vibe grocery shopper:
Yep.
Then there's the small issue that our supposed stream of consciousness isn't like that but is transient all the time.
What I find particularly odd is that some of those who now try to apply magical qualities to consciousness openly do the same for life as well. I don't know if they are willing to go as far as to admit being vitalists, but they sure make it sound the same.
They are apparently working on it.
Looping on smaller Qwen3.5 models seems to be a commonly known issue.
Yep, if quantized versions don't work, bf16 isn't even that fast anymore.
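For reference, roughly how I compare them, as a minimal sketch: this assumes llama-cpp-python rather than whatever the GUI runtime actually does under the hood, the model file names are placeholders, and the prompt is an approximation of the car wash question.

```python
# Sketch: run the same prompt through a quantized and a bf16 GGUF build
# and eyeball the outputs for looping. Assumes llama-cpp-python; the
# file names below are placeholders, not the exact builds I used.
from llama_cpp import Llama

PROMPT = "Should I walk or drive to the car wash?"

def generate(model_path: str) -> str:
    llm = Llama(model_path=model_path, n_gpu_layers=-1, n_ctx=4096, verbose=False)
    out = llm(PROMPT, max_tokens=1024, temperature=0.7, repeat_penalty=1.1)
    return out["choices"][0]["text"]

for path in ("qwen3.5-q4_k_m.gguf", "qwen3.5-bf16.gguf"):
    print(f"--- {path} ---")
    print(generate(path))
```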
Second follow-up question:
"Why is my car there?"
"It sounds like you might be confused about where your car is right now! Since you decided to **walk** to the car wash, your car is actually still at home (or wherever you are starting from), not at the wash yet."
Yeah, blame me.
bf16 said I should walk, and answered that follow-up question by saying I should wash my car.
Since that one didn't loop, chances are quantization breaks it.
But it thinks A LOT, and not too well, which is of course more of a problem for bf16, as it's much slower.
That relatively long loop included stuff like this:
"Wait, is there a chance they are asking about washing their clothes? No."
"Wait, one more thing: Is there a chance they are asking about washing their hands?"
Continued with more of the same:
"OK, so I walk there. What do I wash?"
Aaaand it's in a loop again. So it's not about the runtime.
Turns out I had left the default GGUF runtime selection on Vulkan, and now with CUDA 12 the q4 didn't loop.
But it also told me I should walk.
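If anyone wants to spot these loops without reading walls of thinking tokens, a quick-and-dirty repeated-n-gram check does the job. Plain Python, thresholds picked arbitrarily:

```python
def looks_like_loop(text: str, ngram: int = 8, min_repeats: int = 3) -> bool:
    """Crude loop detector: does the n-gram at the very end of the output
    occur at least `min_repeats` times in the whole text?"""
    words = text.split()
    if len(words) < ngram * min_repeats:
        return False
    # Count every n-gram in the output.
    counts: dict[tuple[str, ...], int] = {}
    for i in range(len(words) - ngram + 1):
        key = tuple(words[i:i + ngram])
        counts[key] = counts.get(key, 0) + 1
    # A run that ends in a heavily repeated n-gram is probably looping.
    tail = tuple(words[-ngram:])
    return counts.get(tail, 0) >= min_repeats

# Example: the "OK, so I walk there. What do I wash?" loop would trip this
# once the same sentences start cycling in the model's reasoning.
```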