Zack Angelo

@zackangelo

building ai inference @ mixlayer

24 Followers
128 Following
8 Posts
Joined 25.10.2024

Latest posts by Zack Angelo @zackangelo

just realized bsky doesn't support gifs lol

15.12.2024 14:40 👍 1 🔁 0 💬 0 📌 0
Post image

functions can even compose, here's the model using the output of one as the input into another

13.12.2024 20:24 👍 0 🔁 0 💬 1 📌 0
Post image

one of the most slept on capabilities of newer AI models is the ability to call multiple tools in a single shot. here's the newest llama 70b running on mixlayer calling 4 tools (lookup weather in 3 cities and perform some arithmetic)

13.12.2024 20:24 👍 2 🔁 0 💬 1 📌 0
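
For readers without the screenshot, here is a minimal sketch of what handling several tool calls from one model response can look like on the application side. Everything in it is an assumption for illustration: the call format (`id`, `name`, JSON-string `arguments`), the tool names, the city names, and the stubbed weather lookup are hypothetical and are not Mixlayer's or Llama's actual API.

```python
import json

# Hypothetical local tools. The weather lookup is a stub, not a real API.
def get_weather(city: str) -> str:
    return f"(stub) weather for {city}"

def add(a: float, b: float) -> float:
    return a + b

TOOLS = {"get_weather": get_weather, "add": add}

def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run every tool call from a single model response, preserving order."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]
        args = json.loads(call["arguments"])  # arguments assumed to be a JSON string
        results.append({"id": call["id"], "name": call["name"], "output": fn(**args)})
    return results

# Assumed shape of one response carrying four calls:
# three weather lookups plus one piece of arithmetic.
example_calls = [
    {"id": "call_1", "name": "get_weather", "arguments": '{"city": "Austin"}'},
    {"id": "call_2", "name": "get_weather", "arguments": '{"city": "Denver"}'},
    {"id": "call_3", "name": "get_weather", "arguments": '{"city": "Boston"}'},
    {"id": "call_4", "name": "add", "arguments": '{"a": 2, "b": 3}'},
]

if __name__ == "__main__":
    for result in run_tool_calls(example_calls):
        print(result["id"], result["name"], result["output"])
```

In a real loop the collected results would be sent back to the model in a follow-up turn so it can write the final answer.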
LLM Reasoning 101 - Mixlayer Large Language Models (LLMs) can be made better at complex reasoning tasks through techniques like few-shot prompting and Chain of Thought (CoT) reasoning, which allow smaller models to match the perf...

Want to play around with chain of thought and some other prompting techniques? I put up a few Mixlayer demos on Meta's Llama 3.1 8b in this blog post. www.mixlayer.com/blog/2024-12...

11.12.2024 16:53 👍 0 🔁 0 💬 0 📌 0

weird that the instruction tuned Llama3 8b models are downloaded less than the original?

04.12.2024 15:53 👍 0 🔁 0 💬 1 📌 0

I doubt they switch to a lower precision model, but would not be surprised if they start using a quantized or fp8 KV cache. Much easier to switch out dynamically in response to load vs the model weights.

23.11.2024 17:43 👍 0 🔁 0 💬 0 📌 0
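
As a rough back-of-the-envelope for the memory at stake, the sketch below compares KV cache size at 2 bytes per element (fp16) and 1 byte per element (fp8). The model shape is an assumption (a Llama 3 8B-style layout with 32 layers, 8 KV heads, and head dimension 128), as is the single 8,192-token sequence. Paging, batching, and quantization scale factors are not modeled.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2 covers one key tensor and one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Assumed shape: 32 layers, 8 KV heads, head_dim 128, 8,192 tokens.
fp16_bytes = kv_cache_bytes(32, 8, 128, 8192, bytes_per_elem=2)
fp8_bytes = kv_cache_bytes(32, 8, 128, 8192, bytes_per_elem=1)

GIB = 2 ** 30
print(f"fp16 KV cache: {fp16_bytes / GIB} GiB")  # 1.0 GiB
print(f"fp8 KV cache:  {fp8_bytes / GIB} GiB")   # 0.5 GiB
```

Under these assumptions, halving the element size halves the per-sequence cache, and the model weights stay untouched.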
Extending the Context Length to 1M Tokens! API Documentation (Chinese) HuggingFace Demo ModelScope Demo Introduction After the release of Qwen2.5, we heard the community's demand for processing longer contexts. In recent months, we have made m...

Crazy to think that a 1M token context window will be the norm soon.

Doesn't look like this model has made it onto HF yet (just a space, no weights), curious to learn more about the sparse attention mechanism.

qwenlm.github.io/blog/qwen2.5...

18.11.2024 15:45 👍 0 🔁 0 💬 0 📌 0

woke up in a 3am fit of terror last night bc I dreamt I left an 8x a100 gpu cluster running by accident 🫠

17.11.2024 13:58 👍 1 🔁 0 💬 0 📌 0