Ryan Angilly
@angilly
Applied Research @ NVIDIA
32 Followers · 34 Following · 25 Posts · Joined 20.11.2024

Latest posts by Ryan Angilly @angilly

How’d it do?

10.12.2024 01:06 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
qwen2.5-coder: The latest series of Code-Specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.

Qwen is probably best out there right now: ollama.com/library/qwen...

09.12.2024 02:02 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
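A minimal sketch of querying a locally served model like this through Ollama's HTTP API. It assumes Ollama is installed and running on its default port (11434) and that the model has been pulled with `ollama pull qwen2.5-coder`; the prompt is just an example.

```python
# Minimal sketch: querying a local qwen2.5-coder model via Ollama's
# HTTP API. Assumes Ollama is running on localhost:11434 and the
# model has been pulled with `ollama pull qwen2.5-coder`.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "qwen2.5-coder") -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_request("Write a Python function that reverses a string.")

# Uncomment to send the request against a running Ollama instance:
# req = urllib.request.Request(
#     OLLAMA_URL,
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```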

If you want a nicer UI, check out OpenWebUI. It presents a nice ChatGPT-esque web UI with history and more.

09.12.2024 02:00 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Super excited about PydanticAI. Looking forward to taking it out for a spin.

02.12.2024 16:36 πŸ‘ 1 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0

My hunch is that they can already write machine code well enough. I've never seen any evals on it, though.

One thing to consider is portability. Machine code is denser than source code, but I'd bet cross-compiling source code to 50 distros is far cheaper from a compute perspective.

02.12.2024 16:38 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

But yeah I guess bottom line, RAG can get you far. Won’t know where it breaks until it does, unfortunately. I look forward to a world where RAG systems can monitor themselves and signal to a user “hey, it might be time to do some fine tuning!”

02.12.2024 15:17 πŸ‘ 2 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Depends on the use case. If the query is “what is my most controversial opinion across all my notes?” then RAG can easily fall over unless you anticipated it ahead of time in the indexing pipeline. That’s admittedly an extreme example, but the spectrum between that and simple fact retrieval is blurry.

02.12.2024 15:12 πŸ‘ 2 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
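The failure mode above can be sketched with a toy retriever (all notes and queries below are made up): top-k lookup handles pointed fact questions, but it has no way to aggregate evidence across the whole corpus, so a global question surfaces little of use.

```python
# Toy illustration (all data hypothetical): top-k keyword retrieval
# answers pointed fact lookups but can't aggregate across notes, so
# a corpus-wide question like "what is my most controversial opinion?"
# never surfaces the notes that actually answer it.
notes = [
    "Meeting with Dana: ship the beta on March 3.",
    "Opinion: tabs are better than spaces, fight me.",
    "Opinion: most unit tests are a waste of time.",
    "Grocery list: eggs, coffee, flour.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank docs by naive word overlap with the query; return top k with overlap > 0."""
    q = set(query.lower().split())
    scored = [(len(q & set(d.lower().split())), d) for d in docs]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]

# A pointed fact query retrieves the right note...
print(retrieve("when do we ship the beta", notes))
# ...but the global question never surfaces the "tabs vs spaces" opinion:
print(retrieve("most controversial opinion across all my notes", notes))
```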

Yeah I get what you’re saying. But I’d caution against dismissing people because they don’t speak for _everyone_.

I am an expert πŸ˜‚ and while I trust LLMs for many things, most of my friends and I very much would not trust an LLM's machine-code output.

02.12.2024 15:00 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

What is the total dataset size in bytes? If complex reasoning across the whole set of notes is required for your use case β€” it could be! β€” RAG will fall over on you.

02.12.2024 13:29 πŸ‘ 2 πŸ” 0 πŸ’¬ 2 πŸ“Œ 0

As an aside to the broader goal of the thread below, this is a question so many people have right now: β€œwhen do I start fine tuning?”

I’ve yet to see good answers.

We are still so early!!

02.12.2024 13:27 πŸ‘ 3 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Have you done any experiments with your benchmarks going from 1 to 100 examples to see if accuracy regresses?

02.12.2024 13:22 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
ChatGPT - x86 Hello World Code Shared via ChatGPT

I think it can[1] but we don’t do it because:

1) We don’t trust the LLM enough; we want to review the code.
2) High-level languages give you a higher density of expression per token, i.e. it takes fewer tokens, so you get faster answers.

[1] chatgpt.com/share/674db3...

02.12.2024 13:20 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
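A rough way to see point 2, the token-density gap: compare a hello world in Python against an illustrative x86-64 assembly version. The counts below are naive whitespace splits, not a real tokenizer, and the assembly listing is just one plausible Linux/NASM rendering.

```python
# Rough illustration of "density of expression": the same program
# needs far more tokens in assembly than in a high-level language.
# Token counts are naive whitespace splits, not a real tokenizer.
python_src = 'print("Hello, world!")'

# Illustrative Linux x86-64 NASM hello world (write + exit syscalls).
asm_src = """
section .data
    msg db "Hello, world!", 10
section .text
    global _start
_start:
    mov rax, 1
    mov rdi, 1
    mov rsi, msg
    mov rdx, 14
    syscall
    mov rax, 60
    xor rdi, rdi
    syscall
"""

py_tokens = len(python_src.split())
asm_tokens = len(asm_src.split())
print(py_tokens, asm_tokens)  # the assembly needs many times more tokens
```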

In-context learning is underrated.

02.12.2024 13:12 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
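A minimal sketch of what in-context (few-shot) learning looks like in practice, with made-up examples: instead of fine-tuning, the "training data" lives in the prompt itself.

```python
# Minimal sketch of in-context learning: steer the model by packing
# a few labeled examples into the prompt instead of fine-tuning.
# The reviews and labels below are made up for illustration.
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot classification prompt from (input, label) pairs."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")  # the model completes from here
    return "\n".join(lines)

examples = [
    ("The battery lasts all day, love it.", "positive"),
    ("Broke after a week, total junk.", "negative"),
]
prompt = few_shot_prompt(examples, "Setup was painless and it just works.")
print(prompt)
```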
Transcript of Hard Fork ep 111: Yeah. And I could talk for an hour about transformers and why they are so important.
But I think it's important to say that they were inspired by the alien language in the film Arrival, which had just recently come out.
And a group of researchers at Google, one researcher in particular, who was part of that original team, was inspired by watching Arrival and seeing that the aliens in the movie had this language which represented entire sentences with a single symbol. And they thought, hey, what if we did that inside of a neural network? So rather than processing all of the inputs that you would give to one of these systems one word at a time, you could have this thing called an attention mechanism, which paid attention to all of it simultaneously.
That would allow you to process much more information much faster. And that insight sparked the creation of the transformer, which led to all the stuff we see in AI today.


Did you know that attention across the whole input span was inspired by the time-negating alien language in Arrival? Crazy anecdote from the latest Hard Fork podcast (by @kevinroose.com and @caseynewton.bsky.social). HT nwbrownboi on Threads for the lead.

01.12.2024 14:50 πŸ‘ 247 πŸ” 53 πŸ’¬ 19 πŸ“Œ 17
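The transcript's one-sentence description, an attention mechanism that "paid attention to all of it simultaneously", can be sketched in a few lines. This is a toy, pure-Python scaled dot-product attention on 2-d vectors, not a real transformer layer.

```python
# The "attention mechanism" from the transcript, in miniature: each
# position scores its similarity against EVERY position at once,
# softmaxes the scores into weights, and takes a weighted sum of values.
import math

def attention(queries, keys, values):
    """Toy scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to all keys simultaneously.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]  # softmax: weights sum to 1
        # Weighted average over all value vectors.
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

toy_q = [[1.0, 0.0]]
toy_k = [[1.0, 0.0], [0.0, 1.0]]
toy_v = [[10.0, 0.0], [0.0, 10.0]]
print(attention(toy_q, toy_k, toy_v))  # leans toward the first value vector
```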

Playing around with @anthropic.com MCP stuff, and found out the hard way that www.claudedesktop.com is not claude.ai/download 😬

Anyway it's working now!

#llm #genai

01.12.2024 05:06 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

I work in it so I’m in a bit of a bubble. What are some of the most egregious lies you see?

30.11.2024 19:46 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Ever wondered how AI autocomplete works? In this thread I’ll walk you through how we work with LLMs at @continuedev.bsky.social to decipher user intent and provide them with useful completions.

Continue is open source so I’ll post links to relevant code on Github at the end of this thread.

30.11.2024 17:06 πŸ‘ 8 πŸ” 3 πŸ’¬ 2 πŸ“Œ 0
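The thread's actual walkthrough isn't captured here; as background, one common recipe behind AI autocomplete is fill-in-the-middle (FIM) prompting: split the code around the cursor into a prefix and a suffix and ask the model to generate the middle. The sentinel tokens below follow one common convention and are illustrative, not necessarily what Continue uses.

```python
# Fill-in-the-middle (FIM) prompting, a common autocomplete recipe:
# the model sees the code before and after the cursor and generates
# what goes in between. Sentinel tokens vary by model; these follow
# one common prefix-suffix-middle (PSM) convention.
def fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle fill-in-the-middle prompt."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

before_cursor = "def add(a, b):\n    return "
after_cursor = "\n\nprint(add(2, 3))\n"
print(fim_prompt(before_cursor, after_cursor))
```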

Ok very cool.

Do you run any benchmarks against your default prompt templates, and have you published them so others can compare different models or prompt/template tweaks?

30.11.2024 18:42 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Do you fine tune any of your models much or do you just work with prompt templating?

30.11.2024 18:15 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Long story short I think the change is a 10 year horizon. Not 2.

30.11.2024 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Only recently have models had long enough context windows, and good enough recall across them, to make retrieval work.

30.11.2024 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

It’s completely transformed how I work: writing code, tests, design docs; less time scouring stackoverflow or fighting with plantuml/mermaid making diagrams. I’m far more productive.

But I’m a special case.

I think the real unlock is going to be agents. This promise still hasn’t been realized.

30.11.2024 18:03 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

How’s this? Can’t tell if it’s underdone.

28.11.2024 19:09 πŸ‘ 1 πŸ” 0 πŸ’¬ 2 πŸ“Œ 0

Flowers enjoyed. Very nice flowers.

27.11.2024 15:00 πŸ‘ 3 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

πŸ™‹πŸ»β€β™‚οΈ

27.11.2024 14:53 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

@dannynewman.bsky.social dude what do we do here?

27.11.2024 14:50 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

I should probably post something so I don’t look like a noob.

18.11.2024 05:06 πŸ‘ 3 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

Ah. Feeds. Got it.

21.11.2024 16:07 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Joined up and said I was interested in tech, science, and programming, but my Discover feed is none of that!

How do I find my genAI nerds?!

21.11.2024 14:58 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0