Adi Mukherjee

@adim.in

Software Engineer, currently on sabbatical in Japan. Prev: Apple SRE. Working on something new.

41
Followers
53
Following
10
Posts
30.11.2024
Joined

Latest posts by Adi Mukherjee @adim.in

fuckin cool

09.01.2025 04:54 👍 562 🔁 58 💬 3 📌 0
AI Semiconductor Landscape feat. Dylan Patel | BG2 w/ Bill Gurley & Brad Gerstner

Really informative episode with SemiAnalysis’ Dylan Patel: share.snipd.com/episode/add3...

08.01.2025 05:42 👍 1 🔁 0 💬 0 📌 0
I made maps that show time instead of space (YouTube video by Václav Volhejn)

Interesting video about building isochrone maps: youtu.be/rC2VQ-oyDG0?...

02.01.2025 01:10 👍 1 🔁 0 💬 0 📌 0

All of Randall Munroe's books are GOAT for kids' non-fiction.

26.12.2024 17:10 👍 63 🔁 4 💬 3 📌 1

Great blog covering the progress this year.
“Asking o1 to complete proofs in creative ways is effectively asking it to be a research colleague. The model doesn't have to get proofs right to be useful, it just has to help us be better researchers.”
Good example of utility that evals fail to capture.

25.12.2024 01:27 👍 1 🔁 0 💬 0 📌 0

Benchmarks are flawed, but one way to trace AI progress over the last year is GPQA Diamond. This is a Google-proof question set on which experts get 81% right in their own fields, while highly skilled non-experts with 30 minutes per question and access to Google get 22%.

GPT-4 got 37% at the start of 2024. o1 got 78%. o3 is 87.7%

24.12.2024 10:58 👍 75 🔁 4 💬 2 📌 3
The Model Context Protocol: Simplifying Building AI apps with Anthropic Claude Desktop and Docker Discover how the Model Context Protocol (MCP) simplifies building AI applications by seamlessly integrating Anthropic Claude with Docker Desktop, enhancing developer productivity and workflow efficien...

Tools for your LLM in containers? Yes please! www.docker.com/blog/the-mod...

24.12.2024 11:03 👍 3 🔁 2 💬 0 📌 0

I wish people would post more links to interesting things

I feel like Twitter and LinkedIn and Instagram and TikTok have pushed a lot of people out of the habit of doing that, by penalizing shared links in the various "algorithms"

Bluesky doesn't have that misfeature, thankfully!

22.12.2024 00:40 👍 905 🔁 111 💬 45 📌 26

I love this idea, thanks for sharing! Btw, in case you revise these, I noticed a typo

24.12.2024 03:51 👍 1 🔁 0 💬 1 📌 0

Comparing NotebookLM audio overviews to @elevenlabsio.bsky.social’s GenFM podcasts: I’m still blown away by the naturalness of NotebookLM’s conversation, but prefer GenFM’s level of detail, even though it’s a more stilted conversation

22.12.2024 09:29 👍 0 🔁 0 💬 0 📌 0

This shift from training to inference compute is good news for hyperscalers and Nvidia.

22.12.2024 07:32 👍 0 🔁 0 💬 0 📌 0

In the ARC AGI eval (linked article in the first post), the ‘high compute’ mode results came from spending ~$350K in total on inference, giving the model more compute to search the solution tree.

22.12.2024 07:27 👍 1 🔁 0 💬 1 📌 0

These models excel at reasoning-heavy tasks like coding and summarisation, and can work through PhD-level problems with sufficient test-time compute. Unlike their predecessors (4o/3.5-sonnet), these reasoning models get ‘smarter’ with inference compute.

22.12.2024 07:25 👍 0 🔁 0 💬 1 📌 0
OpenAI o3 Breakthrough High Score on ARC-AGI-Pub OpenAI o3 scores 75.7% on ARC-AGI public leaderboard.

OpenAI announced its second-generation reasoning model, o3 (yeah, even they admitted they suck at names).
The evals are perhaps the final nail in the coffin for the scaling wall hypothesis, showing that AI models aren’t hitting a plateau in capabilities.
arcprize.org/blog/oai-o3-...

22.12.2024 07:24 👍 0 🔁 0 💬 1 📌 0
ElevenLabs - Introducing the ElevenLabs Reader App | ElevenLabs The ElevenLabs Reader App lets you listen to any text content, with ElevenLabs voices, on the go

Lots of apps have had text-to-speech for years, but ElevenLabs voices really stand out to me for naturalness of enunciation. I use it a lot for listening to articles.
elevenlabs.io/blog/introdu...

22.12.2024 07:15 👍 1 🔁 0 💬 0 📌 0