#Icechunk

Joe Hamman

@jhamman.bsky.social

8 months ago

Super excited to see #icechunk v1.0 ship today. Stable format, stable API, and ready for production. Take it for a spin and let us know how it goes. 🚀🚀🚀

Julius Busecke

@codeandcurrents.bsky.social

10 months ago

Some Postdoc throwback today riding the NJtransit to Princeton.

On my way to the #WCRP km-scale hackathon for the week.

www.wcrp-esmo.org/activities/w...

Excited to play around with #healpix #zarr #icechunk and some super high res data.

Joe Hamman

@jhamman.bsky.social

10 months ago

I'll be at the CNG conference in Snowbird next week. I wrote a short blog post about what the Earthmover team will be up to.

tldr; we'll be talking about @zarr.dev, #icechunk, @xarray.bsky.social and cloud-native data cubes.

Details in the blog post 👇

Joe Hamman

@jhamman.bsky.social

10 months ago

Most people think of @zarr.dev as a "file format". With #Icechunk, we've turned Zarr into a database. @functionth.bsky.social's post shows how Icechunk can be used to solve a problem where transactional databases are often required.

Earthmover

@earthmover.io

10 months ago

2/ Could you run a bank ledger on #Icechunk? A multidimensional array store designed for scientific data probably isn’t the first thing that comes to mind for this application... But surprisingly, it totally works!

Joe Hamman

@jhamman.bsky.social

10 months ago

@zarr.dev and #icechunk are amazing but they are not magic. They are part of a thoughtfully designed cloud-native data architecture. @tegnicholas.bsky.social peels back the covers on cloud-optimized scientific data formats in our latest "Fundamentals" post 👇

Joe Hamman

@jhamman.bsky.social

10 months ago

We found similar results when we first benchmarked #icechunk. Our conclusion: doing IO with a Rust backend is much faster than doing it in Python.

👇Really exciting to see @kylebarron.dev's Obstore backend for Zarr-Python ship today.

Earthmover

@earthmover.io

11 months ago

4/ 📒 How cloud-optimized formats are structured

🧊 How @zarr.dev and #Icechunk are designed to work efficiently in the cloud

🤑 How this saves you money

Joe Hamman

@jhamman.bsky.social

11 months ago

Training AI models at scale from data stored in cloud object storage requires thinking carefully about both bandwidth and concurrency. In this post, @functionth.bsky.social gets into the details of concurrent reads at scale, showing how #Icechunk and S3 can easily scale beyond 200k requests/second!

Earthmover

@earthmover.io

11 months ago

Along the way, he dispels pervasive myths about how S3 prefixes work and the limits that key names impose on scalability. Relevant not just for #Icechunk but for any cloud data system (including Apache Iceberg) that stores data across many objects in object storage.

aimeeb

@yayyyimee.bsky.social

11 months ago

I share @rabernat.bsky.social's excitement about icechunk!!! On top of delivering 100x performance, it can make impossible tasks possible.

Why am I so excited to endorse #icechunk and #virtualizarr?

bsky.app/profile/eart...

Joe Hamman

@jhamman.bsky.social

11 months ago

🚨 New blog post 🚨

In it, we show off our recent work deploying #icechunk on top of #NASA's existing archives of Earth observation data. The results: a 100x speedup when extracting time series from existing datasets stored as netCDF.

Earthmover

@earthmover.io

11 months ago

2/ In the pilot, we used our new open source tensor storage engine #Icechunk and #VirtualiZarr to present archival NetCDF data stored in S3 as a single analysis-ready cloud-optimized (ARCO) dataset.

Joe Hamman

@jhamman.bsky.social

11 months ago

This session is going to be a blast! If you are headed to CNG next month (and you should be!), consider joining us for this workshop on @xarray.bsky.social, @zarr.dev, and #icechunk. 👇👇👇

Earthmover

@earthmover.io

1 year ago

Join our webinar, Data Version Control for Arrays with #Icechunk. @rabernat.bsky.social will explain how Icechunk’s transactions, snapshots, tags, + branches can add safety & flexibility to data pipelines and workflows. Register here: bit.ly/3WVPSf2

Joe Hamman

@jhamman.bsky.social

1 year ago

The 3.0.0 release clears the way for a bunch of exciting extensions built on top of the v3 spec. #icechunk, variable chunking, new dtypes, and more are all now possible. Time to get busy.

Al Merose

@al.merose.com

1 year ago

The parallels between MotherDuck’s ddx storage system and @earthmoverhq.bsky.social’s #icechunk are uncanny, not to mention the shared mission to create cloud-native DBs.

Joe Hamman

@jhamman.bsky.social

1 year ago

With this architecture, we showed that we can easily scale a simple OPeNDAP service, sitting on top of an #icechunk repository in S3, to thousands of requests per second. 🚀

Joe Hamman

@jhamman.bsky.social

1 year ago

In the talk, I made a few simple points:
- Separation of storage and compute is key to unlocking the scaling potential of cloud
- Cloud optimized data formats are key (example: @zarr.dev and #icechunk)
- API services should be stateless/serverless and should be able to scale horizontally [0->N]

Joe Hamman

@jhamman.bsky.social

1 year ago

Also on Thursday afternoon, I'll be giving an invited talk titled "Seamless Arrays: A Full Stack, Cloud-Native Architecture for Fast, Scalable Data Access". It combines all that we've been working on for the past year including @zarr.dev v3, #icechunk, and Xpublish.

agu.confex.com/agu/agu24/me...

Joe Hamman

@jhamman.bsky.social

1 year ago

Monday through Thursday, I'll be hanging out with @rabernat.bsky.social at the @earthmoverhq.bsky.social booth in the exhibit hall. Swing by to say hello or to snag some swag/stickers/etc. We'll also be demoing #icechunk all week.
