Karl Weinmeister (@kweinmeister)

What you need to know about the Gemini Embedding 2 model Embedding models typically understand one thing at a time. Text goes into one model. Images into another. And then audio requires a…

Gemini Embedding 2 is natively multimodal and now in preview.

What's new? How to get started? Migration considerations?

Find out more in my latest post:
medium.com/google-cloud...

11.03.2026 18:50 👍 2 🔁 1 💬 1 📌 0

GitHub - Pavelevich/llm-checker: Advanced CLI tool that scans your hardware and tells you exactly which LLM or sLLM models you can run locally, with full Ollama integration.

github.com/Pavelevich/l...

04.03.2026 23:45 👍 0 🔁 0 💬 0 📌 1

LLM Checker recommends the best model for your hardware.

It scans 200+ Ollama models for optimal quality, speed, fit, and context.

Here it selects Gemma 3 270M to run my coding task.

04.03.2026 23:45 👍 1 🔁 0 💬 1 📌 0

How to Use the Gemini Deep Research API in Production How many of us have gone down the research rabbit hole? Way too many tabs, links, and notes in the pursuit of knowledge? It’s all useful…

medium.com/google-cloud...

04.03.2026 18:13 👍 0 🔁 0 💬 0 📌 0

🕸️ Webhook triggers: Start research with an HTTP request and check the results later (Cloud Run Service)
📦 Batch tasks: Fan out research topics in parallel and exit when finished (Cloud Run Job)
♾️ Continuous dispatcher: Pull tasks from a queue at up to a 40% compute discount (Cloud Run Worker Pool)
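
A rough sketch of the batch pattern above: fan topics out in parallel, collect results and failures, then finish. `run_research` is a hypothetical stand-in for the long-running research call, not a real Gemini or Cloud Run API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_research(topic: str) -> str:
    """Hypothetical placeholder for one long-running research task."""
    return f"report on {topic}"


def run_batch(topics, worker=run_research, max_workers=4):
    """Run every topic in parallel; return (results, errors) keyed by topic."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, topic): topic for topic in topics}
        for future in as_completed(futures):
            topic = futures[future]
            try:
                results[topic] = future.result()
            except Exception as err:
                # One failed topic should not discard the rest of the batch.
                errors[topic] = str(err)
    return results, errors
```

A job entry point could call `run_batch(...)` once and exit non-zero when `errors` is non-empty, so the platform can surface or retry the failure.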

04.03.2026 18:13 👍 0 🔁 0 💬 1 📌 0

Need to automate deep research?

Use the Gemini Interactions API with the right async pattern. 🧵

04.03.2026 18:13 👍 2 🔁 1 💬 1 📌 0

Gemini 3.1 Flash-Lite: Built for intelligence at scale Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.

Big news for fast and cost-efficient AI! Gemini 3.1 Flash-Lite is here:
⚡️ 2.5X faster Time to First Token
📉 $0.25 per 1M input tokens
🧠 Thinking levels for control over reasoning

blog.google/innovation-a...

03.03.2026 17:04 👍 2 🔁 1 💬 0 📌 0

Hemingway-bench AI Writing Leaderboard Stop rewarding slop. Hemingway-bench is an AI writing leaderboard that takes real-world writing tasks and puts them in front of master wordsmiths. Our goal: to push AI writing from two-second vibes…

📝 Blog: surgehq.ai/blog/hemingw...

🏅 Leaderboard: surgehq.ai/leaderboards...

02.03.2026 20:28 👍 1 🔁 0 💬 0 📌 0

Did you know there's a benchmark for writing quality? Hemingway-bench from Surge AI goes beyond "vibes" and robotic checks to measure coherent and relatable storytelling.

02.03.2026 20:28 👍 1 🔁 0 💬 1 📌 0

How to build GenAI apps for resilience with TypeScript Generative AI features like chat and summarization are table-stakes for modern web apps. But the API calls are resource-intensive and their…

My latest blog post shows how to make your TypeScript GenAI app more robust: medium.com/google-cloud...

02.03.2026 17:36 👍 0 🔁 0 💬 1 📌 0

3 tips to build resilience into your GenAI application:

⏳ Use exponential backoff for API calls, ideally directly in the SDK
🔌 Apply a circuit breaker during instability to prevent cascading failures
🖼️ Use "skeleton components" for loading to improve perceived performance
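
A minimal sketch of the first two tips, written in Python for brevity (the linked post targets TypeScript). All names here are illustrative, not taken from the post or from any SDK; where the SDK offers built-in retries, prefer those.

```python
import random
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are being rejected."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def backoff_delays(retries, base=0.5, cap=8.0, rng=random.random):
    """Jittered delays: rng() scaled by min(cap, base * 2**attempt)."""
    return [rng() * min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_backoff(fn, retries=4, sleep=time.sleep, **kwargs):
    """Try fn once, then up to `retries` more times with growing delays."""
    last_error = None
    for delay in [0.0] + backoff_delays(retries, **kwargs):
        if delay:
            sleep(delay)
        try:
            return fn()
        except CircuitOpenError:
            raise  # Retrying against an open breaker would defeat its purpose.
        except Exception as err:
            last_error = err
    raise last_error
```

The two compose as `call_with_backoff(lambda: breaker.call(request))`: transient errors are retried with jitter, and once the breaker opens, callers fail fast instead of piling more load onto a struggling API.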

02.03.2026 17:36 👍 2 🔁 2 💬 1 📌 0

Nano Banana 2: Combining Pro capabilities with lightning-fast speed Our latest image generation model offers advanced world knowledge, production-ready specs, subject consistency and more, all at Flash speed.

blog.google/innovation-a...

27.02.2026 18:35 👍 1 🔁 0 💬 0 📌 0

Watch out for 4 pitfalls when you create infographics.

Nano Banana 2 🍌 can help you out!

27.02.2026 18:35 👍 0 🔁 0 💬 1 📌 0

Agent Skills Management Made Easy I'll show you how to give your AI agent on-demand expertise without burning tokens or context space. You'll learn the secret to packaging knowledge into Agent Skills and how to manage them like a pro…

Check out the 3-min walkthrough of the full lifecycle: www.youtube.com/shorts/UVcMo...

26.02.2026 17:25 👍 0 🔁 0 💬 0 📌 0

Are you using the skills CLI?

My Gemini CLI and Google Antigravity skills are well-organized, thanks to this great tool from Vercel.

26.02.2026 17:25 👍 1 🔁 0 💬 1 📌 0

Serving Qwen 3.5 on Cloud Run with Blackwell GPUs Cloud Run now supports NVIDIA RTX PRO 6000 Blackwell GPUs in preview. With 96GB of GDDR7 VRAM and 1.6 TB/s of memory bandwidth, it’s the…

My latest tutorial shows how to quickly deploy Qwen3.5-35B-A3B on Cloud Run: medium.com/google-cloud...

25.02.2026 21:05 👍 1 🔁 0 💬 1 📌 0

Want to try out two awesome pieces of tech?

Cloud Run now supports NVIDIA RTX PRO 6000 Blackwell GPUs with 96GB VRAM and scale-to-zero inference.

And the new Qwen 3.5 multimodal models are achieving outstanding benchmark results.

25.02.2026 21:05 👍 2 🔁 1 💬 1 📌 0

I Taught My AI Coding Agent to Write YouTube Descriptions After producing dozens of videos, I’ve learned a ton. From researching the right topic to editing, each video brings unique challenges…

Read the full article, with a link to the skill: medium.com/google-cloud...

20.02.2026 20:17 👍 0 🔁 0 💬 0 📌 0

How I use AI Agent Skills to Automate Video Production I've made 64 videos. The recording, the demos, the technical deep dives — that's the fun part. But every video also needs a description, timestamps, and hashtags. Same structure every time. So I…

I automated my video production process with an agent skill.

The skill helps to summarize my transcript, read timestamps from a caption file, and validate its own voice and hashtags.

Agent skills are a big time-saver, and not just for coding!

www.youtube.com/shorts/2D1CS...

20.02.2026 20:16 👍 0 🔁 0 💬 1 📌 0

Gemini 3.1 Pro: A smarter model for your most complex tasks 3.1 Pro is designed for tasks where a simple answer isn’t enough.

Read more: blog.google/innovation-a...

19.02.2026 16:14 👍 0 🔁 0 💬 0 📌 0

Reasoning has been supercharged in Gemini 3.1 Pro, and your agents will benefit.

Industry-leading abstract thinking translates into strong performance on agentic workflows and MCP tool-calling.

19.02.2026 16:14 👍 0 🔁 0 💬 1 📌 0

Good addition, thanks!!

18.02.2026 20:00 👍 1 🔁 0 💬 0 📌 0

The Future of AI Agent Communication I explore how AI agents are evolving from standalone systems to an interconnected ecosystem. I'll walk you through the history of web discovery, the current state of Agent Cards and UCP, and my view…

Resources ⚡

Watch the video on YouTube:
www.youtube.com/shorts/j70Yz...

And read the full article on Medium:
medium.com/google-cloud...

18.02.2026 18:29 👍 0 🔁 0 💬 0 📌 0

spec: AI card and AI catalog draft proposal by mindpower · Pull Request #4 · Agent-Card/ai-card Initial Draft of AI Card & AI Catalog Specifications This PR introduces the foundational draft for the AI Card and AI Catalog specifications, creating a unified framework for describing and dis...

Stage 4: AI Catalog 📚
Probing 5 protocols before "hello" is too chatty.
The Catalog provides a single entry point (ai-catalog.json).
One fetch to discover all services and their Unified AI Cards.

PR #4: github.com/Agent-Card/a...

18.02.2026 18:29 👍 0 🔁 0 💬 1 📌 0

SEP-2127: MCP Server Cards - HTTP Server Discovery via .well-known by dsp-ant · Pull Request #2127 · modelcontextprotocol/modelcontextprotocol This SEP proposes adding a standardized discovery mechanism for HTTP-based MCP servers using a .well-known/mcp.json endpoint. Moved from: #1649 Summary This enables clients to automatically discove...

Stage 3: MCP Server Cards 🛠️
Tool use is exploding.
SEP-2127 proposes "Server Cards" for MCP.
Instead of hardcoding tool URLs, agents will discover tools, transports, and auth requirements automatically.

Proposal: github.com/modelcontext...
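
A small sketch of the discovery step, assuming the card lives at `/.well-known/mcp.json` on the server's origin, as the SEP summary describes. The helper name is hypothetical, and the card's fields are left out because the proposal is still a draft.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_url(server_url: str, name: str = "mcp.json") -> str:
    """Map any URL on a server to its /.well-known/<name> discovery URL."""
    parts = urlsplit(server_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an HTTP(S) URL: {server_url!r}")
    # Well-known files sit at the origin root, so path and query are dropped.
    return urlunsplit((parts.scheme, parts.netloc, f"/.well-known/{name}", "", ""))
```

A client would fetch that URL once and read tools, transports, and auth requirements from the response instead of relying on hardcoded endpoints.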

18.02.2026 18:29 👍 0 🔁 0 💬 2 📌 0

Stage 2: UCP (Commerce) 🛒

The Universal Commerce Protocol lets shopping agents talk to any merchant.
It bundles payments (AP2) so your agent can actually check out without a custom integration for every site.

Details: ucp.dev

18.02.2026 18:29 👍 0 🔁 0 💬 1 📌 0

Stage 1: A2A Agent Cards 🪪

In 2025, agent-card.json became the first AI-specific entry in the /.well-known/ registry.
It’s a JSON contract that tells other agents:
- Here is my name
- Here is what I can do
- Here is how to auth

More info: a2a-protocol.org
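
To make the three bullets concrete, here is an illustrative card and a reader for it. The JSON is a trimmed, made-up example, not the full A2A schema; treat the field names as assumptions to check against the spec at a2a-protocol.org.

```python
import json

# Made-up sample card; a real one is served from the agent's /.well-known/ path.
CARD_JSON = """
{
  "name": "Travel Planner",
  "description": "Plans multi-city itineraries.",
  "skills": [
    {"id": "plan-trip", "name": "Plan a trip"},
    {"id": "find-flights", "name": "Find flights"}
  ],
  "securitySchemes": {"bearer": {"type": "http", "scheme": "bearer"}}
}
"""


def summarize_card(raw: str) -> dict:
    """Pull out the name, what the agent can do, and how to authenticate."""
    card = json.loads(raw)
    return {
        "name": card["name"],
        "can_do": [skill["name"] for skill in card.get("skills", [])],
        "auth": sorted(card.get("securitySchemes", {})),
    }
```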

18.02.2026 18:29 👍 0 🔁 0 💬 1 📌 0

The Past: Web Standards 🌐

The web solved discovery decades ago with /.well-known/.
Think robots.txt or openid-configuration.
Put a machine-readable file at a predictable URL.
Simple. Effective.

18.02.2026 18:29 👍 1 🔁 0 💬 2 📌 0

What are the trends shaping agent discoverability and interoperability? 🧵

A2A and UCP have laid the groundwork for agent communication and commerce.

Let's walk through new proposals being discussed in the AI community, and how they could help.

18.02.2026 18:29 👍 1 🔁 0 💬 1 📌 0

The connection to strawberry makes sense. It’s just the latest gap that bubbled up.

16.02.2026 18:06 👍 2 🔁 0 💬 1 📌 0