Why we maintain a ClickHouse fork at Tinybird (and how it's different)
www.tinybird.co/blog-posts/w...
I wanted to play a little bit and tried to do it without sharding; I'll publish the blog post in the coming weeks. This one is playing "easy mode".
t.co/DDckrn8cw5
I wrote about how to ingest 1b rows per second with ClickHouse.
The point of the post is not the number. Pushing 1b rows is "easy"; the actual challenge comes when you need to do that in a reliable way, and that's what I try to explain.
Python and C++
We use Go-based tools, but not as a language in our product
- We are using Tinybird to handle autoscaling (instead of Prometheus)
- Scaling up ingestion in an OLAP workload
- Our opinionated way for compute-compute separation
- How we optimized ingestion with C++ (not Rust, sorry)
- Cutting costs with Karpenter
www.tinybird.co/blog
I'm fucking tired of clickbaity content, people writing bad content just for SEO or not going into details because "people have a short attention span". Maybe that's true but some people appreciate long-form, detailed and super technical content.
Today we posted 5 high-quality engineering posts:
The second part of "handling clickhouse clusters at petabyte scale" www.tinybird.co/blog-posts/w...
We've been operating petabyte-scale clusters for 5 years now, mostly for low-latency use cases over large amounts of data, some of them under 10ms (for reference, most big data systems have a latency of 1 second at best).
www.tinybird.co/blog-posts/w...
Thanks Gunnar
Thanks Nico
We need faster feedback loops
failingwithdata.substack.com/p/we-need-a-...
I'd appreciate it if you could upvote us on Product Hunt. It's about our latest product iteration
www.producthunt.com/posts/tinybi...
Tinybird Forward is here »
Forward is a major evolution of Tinybird, designed to make shipping software with big data requirements faster and more intuitive.
No complex infra project. No context switching. No esoteric architectures. Just code.
(sound on!)
youtu.be/vaSjWu3XFdY
You need developers that care. I think they call it "being accountable".
Today we are releasing a new Tinybird, thanks to every single one of you who spends time keeping this running.
curl tinybird.co | sh
Maintenance is not fun. You need people that make it fun, developers that write reports you actually want to read about the most boring fix.
Alerting is tricky and you need to fiddle for weeks until you nail it.
Every new customer makes the development team slower. Maybe it's just 1 hour a month, which is roughly 0.015% of a 40-developer team's time. Seems low but it adds up.
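A rough check of that figure, as a minimal sketch. It assumes a 160-hour working month per developer, which is my assumption and not something the post states; the function name is made up for illustration. One hour a month lands in the same ballpark as the number above, and the loop shows how it adds up as customers grow.

```python
# Assumption: 160 working hours per developer per month (not from the post).
HOURS_PER_MONTH = 160
TEAM_SIZE = 40  # from the post: a 40-developer team


def team_time_share(hours_lost, team_size=TEAM_SIZE, hours_per_month=HOURS_PER_MONTH):
    """Fraction of the team's monthly capacity consumed by `hours_lost` hours."""
    return hours_lost / (team_size * hours_per_month)


# One customer costing one hour a month.
print(f"1 customer: {team_time_share(1):.4%} of the team's time")

# The same cost multiplied across a growing customer base (hypothetical counts).
for customers in (100, 500, 1000):
    print(f"{customers} customers: {team_time_share(customers):.2%} of the team's time")
```

With different assumptions about working hours the exact percentage moves a little, but the shape is the same: a cost that looks negligible per customer grows linearly with the customer count.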
That small issue you don't have time to fix is blocking someone.
Every single day there is a moment you remember there are thousands of requests coming into your system every second.
You spend 1 month chasing that impossible-to-reproduce bug.
You fail and you need to explain it to your customers, who lose money because of you.
When you're in production, things fail, users complain, on-call alerts wake you up in the middle of the night, incidents happen. Those endless video conferences nobody leaves until they are 100% sure not a single client is still affected.
Production hurts.
If it doesn't, it means you didn't try hard enough.
A new LLM version release with a new super powerful option: --schema
A mini BI example: in this case I'm querying a Parquet file with all the events coming from the GitLab tickets webhook
Data-intensive SaaS products will love to leave your data in Iceberg/S3
Guess who is going to pay all the storage and operation costs? 🤣
You can subscribe and get updates in our blog www.tinybird.co/blog
That's the benefit of having a data engineering team focused 100% on complex data-intensive applications.
And posts like this will keep coming. I'm a little bit tired of "developer marketing" that explains the surface and doesn't get into deep technical details.
We've got a deep technical post almost ready on real-time log analytics, based on our experience designing these systems with companies handling trillions of rows.
Nope, generally available models
You'll need to iterate a few times until you have the data you need but it's better than spending a few hours working on a script.
So it generates data that fits the table schema. But usually, you want the data to have a particular shape, so you can give some instructions to the command:
tb mock table --prompt "dates should be in september 2018"
We are using LLMs in the next Tinybird iterations but we are trying to be subtle, not going with the conversational interface. So, for example, when you want to create synthetic data to test your logic you can do:
tb mock table