
Marco Slot

@marcoslot.com

Mostly posts about PostgreSQL, Snowflake Postgres, and PostgreSQL extensions. Formerly Crunchy Data, Microsoft, Citus Data, AWS, TCD, VU

1,097 Followers
356 Following
95 Posts
Joined 17.10.2024

Latest posts by Marco Slot @marcoslot.com

Postgres for the lakehouse: pg_lake, Tue, Jan 13, 2026, 12:00 PM | Meetup Marco Slot will be here this month to talk about the new extension pg_lake that connects Postgres to lakehouse and object storage - for Iceberg, Parquet, csv and more. As

Next Tuesday @marcoslot.com will be at Postgres Meetup for * to talk about pg_lake - #Postgres for #Iceberg with #DuckDB.

Join us!

www.meetup.com/postgres-mee...

09.01.2026 15:02 👍 5 🔁 1 💬 0 📌 0

Docs: github.com/Snowflake-La...

04.11.2025 16:07 👍 0 🔁 0 💬 0 📌 0

pg_lake just went open source! (Apache 2.0)

pg_lake is a set of extensions (from Crunchy Data Warehouse) that add comprehensive Iceberg support and data lake access to Postgres, with @duckdb.org transparently integrated into the query engine.

Announcement blog: www.snowflake.com/en/engineeri...

04.11.2025 16:04 👍 27 🔁 3 💬 1 📌 3

Safety vs. Flexibility quadchart for different DBMSs. VLDB 2025 https://doi.org/10.14778/3725688.3725719

No system hits the sweet spot of allowing for extensibility while maintaining systems safety. It would be nice if there was a standard plugin API (think POSIX) that allows compatibility across systems.

Thanks to @marcoslot.com + @daveandersen.bsky.social for their collaboration on this project

03.07.2025 19:02 👍 10 🔁 1 💬 0 📌 0

At last @abigalekim.bsky.social's paper is out! It's the most complete eval of DB extensions/plugins ever. We analyze PostgreSQL, MySQL, MariaDB, SQLite, DuckDB, Redis.
TLDR: Postgres extns ecosystem is fraught with footguns. Other DBMSs have fewer extns but fewer problems. DuckDB has cleanest API.

03.07.2025 19:02 👍 67 🔁 12 💬 1 📌 2
Crunchy Data Joins Snowflake | Crunchy Data Blog We are excited to announce that Crunchy Data is joining Snowflake to bring Postgres to the AI Data Cloud.

Five years ago I joined @crunchydata.com, shortly after I wrote about having unfinished business with Postgres. Today as part of Snowflake that journey is continuing. We've built some amazing things, but are just getting started.

www.crunchydata.com/blog/crunchy...

02.06.2025 20:44 👍 31 🔁 5 💬 6 📌 2
Converging Database Architectures DuckDB in PostgreSQL
YouTube video by Data Council

Recording of my Data Council talk:
www.youtube.com/watch?v=HZAr...

29.05.2025 21:18 👍 15 🔁 2 💬 0 📌 1

Generative AI comes up with details that would be hilarious if it weren't so mind-boggling that it can come up with them.

04.05.2025 21:33 👍 1 🔁 0 💬 0 📌 0

Thanks Gunnar!

We generate regular position delete files for merge-on-read, so any Iceberg query engine can read them. Equality deletes would be more CDC-friendly, but are not supported in most engines.

We have some secret sauce around how we track/know positions, but being Postgres helps a lot there.

22.04.2025 16:52 👍 5 🔁 0 💬 1 📌 0

And there it is: Native logical replication from any Postgres server to Iceberg managed by Crunchy Data Warehouse.

Speed up Postgres analytical queries 100x with 2 commands.

22.04.2025 14:48 👍 22 🔁 2 💬 2 📌 0
Building a Postgres Data Warehouse with Iceberg
YouTube video by Apache Iceberg™ Meetup

I gave a talk at the inaugural (and awesome) European Iceberg meetup in Amsterdam last night.

It's an introduction to how and why we used Iceberg and DuckDB to build a Postgres Data Warehouse:
www.youtube.com/watch?v=cEnq...

03.04.2025 21:58 👍 7 🔁 0 💬 0 📌 0

Move fast and build solid solutions that work across platforms.

You can now use Postgres as a modern Data Warehouse anywhere, using any S3-compatible storage API. Query, import, or export files in your data lake or store data in Iceberg with automatic maintenance and very fast queries.

01.04.2025 17:10 👍 1 🔁 0 💬 0 📌 0
Crunchy Data Warehouse: Postgres with Iceberg Available for Kubernetes and On-premises | Crunchy Data Blog Crunchy Data brings Postgres-native Apache Iceberg to Kubernetes and on-prem workloads.

Excited to announce Crunchy Data Warehouse is now available for Kubernetes and On-premises. Need faster analytics from Postgres? Want a native Postgres data lake experience? Learn more about how it works: www.crunchydata.com/blog/crunchy...

01.04.2025 15:56 👍 6 🔁 1 💬 0 📌 2

Amazing result

28.03.2025 08:10 👍 69 🔁 8 💬 3 📌 2

Would be cool if Iceberg/Parquet had support for storing JSON as vectors.

26.03.2025 21:26 👍 0 🔁 0 💬 0 📌 0

Generally by unwinding the JSON in the insert..select that processes the raw log files. JSON is broken down into columns by default, though nested JSON remains as jsonb. You can either store that directly (stored as string), unwind it manually, or convert to a composite type (stored as struct)

26.03.2025 21:25 👍 1 🔁 0 💬 1 📌 0
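
The reply above describes breaking top-level JSON fields into columns while leaving nested JSON as a single value. As a rough illustration only (this is not the Crunchy Data Warehouse implementation, and `flatten_log_record` is a hypothetical helper name), the same idea can be sketched in Python:

```python
import json


def flatten_log_record(raw_line):
    """Turn one JSON log line into a flat row (hypothetical helper).

    Top-level scalar fields become columns. Nested objects and arrays are
    kept as a JSON string, loosely mirroring "nested JSON remains as jsonb
    (stored as string)" from the post.
    """
    record = json.loads(raw_line)
    row = {}
    for key, value in record.items():
        if isinstance(value, (dict, list)):
            # Nested value: keep it as serialized JSON instead of unwinding it.
            row[key] = json.dumps(value, sort_keys=True)
        else:
            row[key] = value
    return row


if __name__ == "__main__":
    line = '{"level": "info", "status": 200, "ctx": {"b": 1, "a": 2}}'
    print(flatten_log_record(line))
```

Unwinding a nested field manually would correspond to recursing into `value` rather than serializing it.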

We weren't really thinking of log management as a target use case, but Iceberg is ideal as the final destination for logs, and having transactions & built-in job scheduling & a fast query engine (& laser focus on developer experience) makes things really simple and cost-effective.

26.03.2025 19:39 👍 7 🔁 2 💬 1 📌 0
Reducing Cloud Spend: Migrating Logs from CloudWatch to Iceberg with Postgres | Crunchy Data Blog How we migrated our internal logging for our database as a service, Crunchy Bridge, from CloudWatch to S3 with Iceberg and Postgres. The result was simplified logging management, better access with SQ...

I got a number of questions on how we saved $30k a month on CloudWatch by moving logs directly to S3/Iceberg with Postgres, so I wrote up how in a bit more detail - www.crunchydata.com/blog/reducin...

26.03.2025 17:59 👍 7 🔁 2 💬 0 📌 1
Automatic Iceberg Maintenance Within Postgres | Crunchy Data Blog Iceberg can create orphan files during snapshot changes or transaction rollbacks. Crunchy Data Warehouse automatically cleans up the orphan files using a new autovacuum feature.

Excited to announce built-in maintenance for Iceberg via Postgres.

Now within Crunchy Data Warehouse we will automatically vacuum and continuously optimize your Iceberg data by compacting and cleaning up files.

Dig into the details of how this works www.crunchydata.com/blog/automat...

20.03.2025 15:46 👍 12 🔁 3 💬 0 📌 0

Imagine your potential customer as a serious company doing serious things, and willing to pay serious money if you can genuinely help them run their business without causing a lot of new problems.

Then go build products for that customer.

This works.

14.03.2025 20:04 👍 0 🔁 0 💬 0 📌 0

Auto-vacuum for #Iceberg tables is now available in Crunchy Data Warehouse!

We're always aiming for a 0-touch experience where possible, so we went out of our way to make Iceberg compaction & cleanup fully automatic without any configuration.

Still pretty interesting to see a manual vacuum:

11.03.2025 14:56 👍 6 🔁 0 💬 0 📌 0

A big part of building Crunchy Data Warehouse was ease of use. How easy is it to load data from existing public datasets?

Step 1: Point at your dataset and we'll load it for you
Step 2: Query it
Step 3: Profit

27.02.2025 18:39 👍 5 🔁 1 💬 0 📌 0

ChatGPT Plus had a good run, but looks like Le Chat is going to be my main assistant now.

I like that it's fast, to the point, and quite clever.

I was impressed with a SQL query it came up with today for finding contiguous ranges of integers. ChatGPT's version was 3x slower.

14.02.2025 12:26 👍 3 🔁 0 💬 0 📌 0
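
The post above does not show the SQL query it mentions. For readers unfamiliar with the problem, here is a minimal Python sketch of one common way to find contiguous ranges of integers; it is an assumption about the general technique, not the query either assistant produced. It relies on the fact that, in a sorted list of distinct integers, consecutive values share the same difference between value and position, which is the same idea as grouping by `value - row_number()` in SQL.

```python
from itertools import groupby


def contiguous_ranges(values):
    """Return (start, end) pairs, one per run of consecutive integers."""
    ordered = sorted(set(values))
    ranges = []
    # Within a run of consecutive integers, value - index stays constant,
    # so grouping on that difference separates the runs.
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in group]
        ranges.append((run[0], run[-1]))
    return ranges


if __name__ == "__main__":
    print(contiguous_ranges([1, 2, 3, 7, 8, 10]))  # [(1, 3), (7, 8), (10, 10)]
```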

Postgres is increasingly becoming a versatile data platform, instead of just an operational database.

Using pg_parquet you can trivially export data to S3, and using Crunchy Data Warehouse you can just as easily query or import Parquet files from PostgreSQL.

07.02.2025 11:11 👍 10 🔁 2 💬 0 📌 0

Deepseek R1 in an ollama "container app" on a managed Postgres server, because... why not?

28.01.2025 15:50 👍 5 🔁 0 💬 0 📌 0

5 years from now, no one's going to want slower, less reliable, or harder to use databases.

27.01.2025 23:24 👍 1 🔁 0 💬 1 📌 0
GitHub - microsoft/documentdb: DocumentDB offers a native implementation of document-oriented NoSQL database, enabling seamless CRUD operations on BSON data types within a PostgreSQL framework.

🎉 pg_documentdb is open source

I created the initial version with Vinod Sridharan (an absolutely brilliant engineer) at Microsoft a few years ago and it's come a long way since.

It reimplements Mongo API with exact semantics in PostgreSQL. Already used by FerretDB!

github.com/microsoft/do...

23.01.2025 19:58 👍 46 🔁 16 💬 0 📌 1

Impressed by the latest ParadeDB release.

Solving the right problems in the right way is really hard.

17.01.2025 21:26 👍 7 🔁 1 💬 0 📌 0

1/11. ParadeDB is now integrated with Postgres block storage. As far as we know, no one has integrated a search and analytics engine with Postgres storage before. This is a big deal.

Here's why we did it, how we did it, and why you should care. 🧵

17.01.2025 19:11 👍 30 🔁 7 💬 3 📌 1
Postgres Tuning & Performance for Analytics Data | Crunchy Data Blog Karen digs into Postgres strategies for working with large analytical data sets. She reviews tuning, strategies for pre-compiling data, and other analytics systems.

A lot of great recommendations on tuning PostgreSQL for analytical queries by @karenhjex.bsky.social

www.crunchydata.com/blog/postgre...

09.01.2025 19:37 👍 13 🔁 4 💬 0 📌 0