Pavel Veselý

@pavelvesely

Computer scientist at Charles University, Prague 🇨🇿 I like all kinds of efficient algorithms and data structures for large datasets || also ⛰️🇺🇦 https://iuuk.mff.cuni.cz/~vesely/

68 Followers
58 Following
26 Posts
Joined 10.12.2024

Latest posts by Pavel Veselý @pavelvesely

Postdoctoral Research Associate (Fixed Term) at University of Cambridge

Postdoc position in Cambridge with Julia Wolf:

Julia is a phenomenal researcher and a wonderful collaborator. She is advertising a 2-year postdoc in additive combinatorics and model theory.

Closing date: 16 March. Details: jobs.ac.uk/job/DQP803/postdoctoral-research-associate-fixed-term

25.02.2026 09:51 👍 3 🔁 3 💬 0 📌 0

Today the winter Olympics open. Russia and Belarus are allowed to participate under neutral flag.

Picture by Gatis Šļūka

06.02.2026 14:27 👍 1087 🔁 463 💬 33 📌 16
Algorithmic Data Privacy, summer semester 24/25, Pavel Hubáček and Pavel Veselý

Dear Gautam, your notes and videos are really great! Indeed, I learned a lot from them, and Pavel Hubáček and I teach DP based on your notes plus a couple of other sources: iuuk.mff.cuni.cz/~vesely/vyuk...
Thanks a lot for sharing them!

05.02.2026 17:53 👍 2 🔁 0 💬 1 📌 0
HLi Lab - Vacancies Openings

I am looking for a postdoc to develop high-performance algorithms in computational genomics. Email or DM me if interested. For more information, see hlilab.github.io/vacancies. RTs appreciated!

14.01.2026 15:44 👍 43 🔁 64 💬 1 📌 0

The #dblp computer science bibliography faces a strong demand. But its net budget is shrinking. This is why we humbly ask for your kind support in the form of a donation to Schloss #Dagstuhl LZI.

Learn more or donate here:
www.dagstuhl.de/en/dblp/donate

Thank you very much!

18.12.2025 16:42 👍 4 🔁 4 💬 0 📌 0

Thanks @brinda.eu for the nice sketch of our work with Ondřej Sladký! 👇

08.12.2025 09:57 👍 2 🔁 0 💬 0 📌 0

1/9 Just out:

k-mer indexes are the backbone of fast search in genomic data, but many degrade under small k, subsampling, or high diversity.

With Ondřej Sladký and @pavelvesely.bsky.social we asked: can we build one that works efficiently for any k-mer set?
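
As a point of reference for what a k-mer index has to answer, here is a minimal sketch of the naive baseline: put every k-mer of a sequence into a hash set and test membership. This is not the index from the paper. The function names are hypothetical, and treating a k-mer and its reverse complement as one item (canonical k-mers) is an assumption made for illustration.

```python
# Naive k-mer membership baseline (illustrative only, not the paper's index).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def build_kmer_set(seq, k):
    """Collect the canonical k-mers of seq, skipping windows with non-ACGT letters."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT"):
            kmers.add(canonical(window))
    return kmers


def contains(kmer_set, kmer):
    """Membership query: is this k-mer (or its reverse complement) in the set?"""
    return canonical(kmer.upper()) in kmer_set
```

A plain set like this spends far more memory per k-mer than a compressed index would, which is the kind of cost that space-efficient indexes aim to reduce.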

05.12.2025 17:42 👍 27 🔁 13 💬 1 📌 1

🧮 Just out in Bioinformatics Advances: “FroM Superstring to Indexing: A space-efficient index for unconstrained k-mer sets using the Masked Burrows-Wheeler Transform (MBWT)” 

Full article available: https://doi.org/10.1093/bioadv/vbaf290 

Authors include: @pavelvesely.bsky.social, @brinda.eu

05.12.2025 10:01 👍 11 🔁 3 💬 1 📌 1

WHAT

04.12.2025 11:35 👍 2 🔁 3 💬 0 📌 0

Optimized k-mer search across millions of bacterial genomes on laptops https://www.biorxiv.org/content/10.1101/2025.11.23.690050v1

26.11.2025 16:47 👍 26 🔁 13 💬 0 📌 1

Pathetic

This one word best captures the act of one of this country's newly elected constitutional officials

Starting your term by taking the Ukrainian flag down from the building nicely illustrates what he is after: not improving this country, but only stirring up passions

What to do about it? I bought a drone

06.11.2025 20:57 👍 98 🔁 7 💬 5 📌 0

Thanks for your interest! Unfortunately, we don't have such an online class, and it'll actually be in Czech.

04.10.2025 08:25 👍 0 🔁 0 💬 0 📌 0
DOD 2024: data sketching. From GB to kB: how to sketch big data without losing your head (or tail). Pavel Veselý

I already gave such a talk last year, starting with some motivation and then talking about cardinality estimation (hiding many details). About 50 students attended, and some interacted at the beginning. Slides in Czech (hopefully easy to translate nowadays):
docs.google.com/presentation...

02.10.2025 13:23 👍 0 🔁 0 💬 0 📌 0

During the next two months, I will give two long talks about streaming algorithms / data sketching for high-school students. Have you given a similar talk? What was your experience?

02.10.2025 13:23 👍 1 🔁 0 💬 1 📌 0

Btw, Nick's profile: @nickmatsakis.bsky.social

16.09.2025 21:05 👍 0 🔁 0 💬 0 📌 0

The algorithm gives more than just an estimate for the diameter: Using the stored points, one can sqrt(2)+epsilon-approximate Furthest Neighbor queries or 1.22-approximate the Minimum Enclosing Ball. The approx. ratio for diameter is optimal but it's still open for MEB. Lots of nice open problems!

16.09.2025 21:04 👍 0 🔁 0 💬 0 📌 0

Specifically, for sqrt(2)+epsilon-approximation of diameter, the AS'10 algorithm stores O(1/epsilon^3 log(1/epsilon)) input points, and we managed to shave off one factor of 1/epsilon. Still, we can only prove a lower bound of 1/epsilon, and closing the gap is a nice open problem!

16.09.2025 21:04 👍 0 🔁 0 💬 1 📌 0

The AS'10 algorithm covers points by blurred balls, and this approach overall works. Adding a few ideas, we have circumvented the issue in AS'10, slightly simplified the algorithm and its analysis, and improved the space bounds.

16.09.2025 21:04 👍 0 🔁 0 💬 1 📌 0
Streaming Algorithms for Extent Problems in High Dimensions - Algorithmica We present (single-pass) streaming algorithms for maintaining extent measures of a stream S of n points in $\mathbb{R}^{d}$. We focus on designing streaming algorithms whose working space is polynom...

In fact, we simplify a nice paper by Agarwal and Sharathkumar from SODA'10 and Algorithmica '15. Yet, even though it was published in a decent journal, there appears to be a subtle flaw in the argument, and fixing it probably requires using more space...
link.springer.com/article/10.1...

16.09.2025 21:04 👍 0 🔁 0 💬 1 📌 0
Streaming Diameter of High-Dimensional Points We improve the space bound for streaming approximation of Diameter but also of Farthest Neighbor queries, Minimum Enclosing Ball and its Coreset, in high-dimensional Euclidean spaces. In particular, o...

Tomorrow at ESA: my former postdoc Nick Matsakis will present our streaming algorithm for diameter in high-dimensional spaces. Very simple: just 4 lines of pseudocode, and yet, achieving optimal approximation. Joint work with Magnús M. Halldórsson. arxiv.org/abs/2505.16720
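
For contrast, here is a minimal sketch of a folklore baseline, which is not the algorithm from the paper: remember only the first point of the stream and track the largest distance from it to any later point. By the triangle inequality, that value is at least half the true diameter, so it is a 2-approximation using a single stored point. The function name is hypothetical, and points are assumed to be equal-length tuples of floats.

```python
import math


def diameter_2_approx(points):
    """One-pass baseline: max distance from the first point to any other point.

    The returned value r satisfies r <= diameter <= 2 * r.
    Illustrative only; this is not the algorithm from the paper.
    """
    stream = iter(points)
    try:
        first = next(stream)
    except StopIteration:
        return 0.0  # empty stream: define the diameter as 0
    best = 0.0
    for p in stream:
        best = max(best, math.dist(first, p))
    return best
```

The gap between this factor of 2 and the sqrt(2)+epsilon ratio mentioned in the thread is what the extra stored points pay for.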

16.09.2025 20:44 👍 2 🔁 0 💬 2 📌 0
On 9/16/25, celebrate a date of mathematical beauty

Pythagorean Triple Square Day, as one man affectionately calls 9/16/25, is a day like no other this century.

16.09.2025 11:50 👍 1196 🔁 628 💬 23 📌 111

Zstandard's --long range mode works wonders for assemblies, but needs uninterrupted single line sequences.

*AllTheBacteria 661k, multiline fasta*
gzip (pigz): 751GB
zstandard --long: 641GB (30% original size)

*Single line fasta*
gzip (pigz): 700GB
zstandard --long: 232GB (10% original size)
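
The single-line requirement can be met with a small preprocessing step. Below is a minimal sketch, assuming plain-text FASTA where header lines start with ">"; the function name is hypothetical and this is not necessarily how the numbers above were produced.

```python
def unwrap_fasta(lines):
    """Yield FASTA lines with each record's sequence joined onto a single line."""
    seq_parts = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if it had any sequence.
            if seq_parts:
                yield "".join(seq_parts)
                seq_parts = []
            yield line
        elif line:
            # Sequence line: buffer it until the record ends.
            seq_parts.append(line)
    if seq_parts:
        yield "".join(seq_parts)
```

The resulting lines can be written out with a trailing newline each and then compressed with zstd using its --long option, as in the post.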

09.09.2025 10:27 👍 36 🔁 12 💬 2 📌 3

🌎👩‍🔬 For 15+ years biology has accumulated petabytes (million gigabytes) of🧬DNA sequencing data🧬 from the far reaches of our planet.🦠🍄🌵

Logan now democratizes efficient access to the world’s most comprehensive genetics dataset. Free and open.

doi.org/10.1101/2024...

03.09.2025 08:39 👍 218 🔁 118 💬 3 📌 16

At scale, the way that we store (and process) data matters! Many may think that the way we keep data, the file formats we adopt, and the way that we compress data are unimportant details, but they are, in fact, critical considerations to allow science to move forward at scale!

26.08.2025 14:45 👍 23 🔁 8 💬 1 📌 0
Some thoughts on journals, refereeing, and the P vs NP problem A guest post by Eric Allender prompted by an (incorrect) P ≠ NP proof recently published in Springer Nature's Frontiers of Computer Scie...

Springer publishes a P ≠ NP "proof" and Eric Allender has words to say.

blog.computationalco...

04.08.2025 18:08 👍 43 🔁 15 💬 2 📌 5

A monumental collaborative effort with many incredible people ☺️ Proud to be part of this!

10.06.2025 08:21 👍 9 🔁 6 💬 0 📌 0

Slides from my talk (with @kamilsjaron.bsky.social) on a history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kme...

03.06.2025 09:25 👍 44 🔁 24 💬 1 📌 2

Yes! Even a full hard drive with family photos is apparently a useful computational resource.

30.05.2025 11:42 👍 3 🔁 0 💬 0 📌 0
Turnstile majority A famous algorithm of Boyer and Moore for the majority problem finds a majority element in a stream of elements while storing only two values, a single tenta...

Nicely written blog post by David Eppstein on the Boyer–Moore (deterministic) streaming algorithm to find a majority element in a stream, and its extensions, first to the turnstile model, and then to frequency estimation (Misra–Gries).
11011110.github.io/blog/2025/05... via @theory.report
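
For readers who have not seen these algorithms, here is a minimal sketch of the two insertion-only ones named above: the Boyer–Moore majority vote and the Misra–Gries summary. The function names and the parameter k are chosen here for illustration, and the turnstile extension discussed in the blog post is not covered.

```python
def majority_candidate(stream):
    """Boyer-Moore vote: one candidate and one counter.

    If some element occurs in more than half of the stream, it is returned.
    Otherwise the result is arbitrary and needs a second pass to verify.
    """
    candidate, count = None, 0
    for x in stream:
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    return candidate


def misra_gries(stream, k):
    """Misra-Gries summary with at most k - 1 counters.

    Each stored count underestimates the true frequency by at most n / k,
    where n is the stream length. Items with no counter have estimate 0.
    """
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # No free counter: decrement all and drop those reaching zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

With k = 2 the summary keeps a single counter and behaves like the majority vote, which is one way to see Misra–Gries as its generalization.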

06.05.2025 13:30 👍 18 🔁 3 💬 1 📌 0

Thanks a lot for organizing the workshop! I really enjoyed it!

26.04.2025 15:03 👍 0 🔁 0 💬 0 📌 0