And perversely, the more voluminous the documentation, the more effort it takes to keep it synchronised with the code, and the more likely it is for inconsistencies to appear.
Yeah, documentation and comments can easily fall into the trap of describing the implementation rather than the intent. What makes them potentially worse is that there's nothing that forces them to stay synchronised with the actual implementation. They can easily end up being misinformation.
Realised that many of my favourite soundtracks to wrangle data to are break-up albums.
I think it's the combination of bitterness, anger, sadness, but also resolve, and maybe a smidge of hope.
Perfect for righting the wrongs of Other People's Data.
#databs #rstats
If you accept that, there's a corollary regarding language choice:
Domain-specific languages are good, actually! Preferable, even.
Provided you can find peers who are fluent, you can communicate a data analysis more clearly and with less effort in a DSL like R than you can in a more general language.
Your code can either help or hinder in this.
Code that aids communicates in the language of the domain. It surfaces assumptions rather than burying them. It makes the inferential connections legible.
Merely running without obvious error is a prerequisite for your task, not the goal.
A lot of great answers!
The best shot you have of verifying a data analysis is to convince a diverse jury of your peers that every plank you laid from data collection, to data wrangling, to modelling, to conclusion, forms a logically sound bridge that can hold the weight of your inference.
Former technical writer here.... YES
The bikes-for-transportation community is absolutely charging their cameras to capture video riding past the forthcoming gas station lines.
Absolutely!
Also haven't checked out the test set yet. Which ep are you channeling?
Yesssss yesssssssss
💯
💯
💯
Thinking about this lately in the context of LLMs, and their propensity to add unnecessary complexity with each iteration of the prompt.
I meant you #databs
Alright class, #rstats #dataabs, a question, and this will be on your exam:
Code quality, in particular readability and the management of complexity, is *much* more important in data analysis than in software engineering in general. Why?
The other candidate is possibly Microsoft Excel.
Came here to ask same. Fear the ANACONDA.
Me: oh maybe I'll write my second blog post ever about AI!
<reads Dave's blog>
Oh, no I just need to direct people to Dave's good post.
Switch your fossil up to minimise the chances of getting the one with the goat behind it.
The human skin toned squiggly asterisk logo for Claude.
Hey Claude
Draw yourself a logo befitting The Suppository of All Wisdom.
This man spending just 0.5K a month on tokens and working only on his nights and weekends has shipped an application every 30 days.*
*Individual results may vary
When we talk about Baby Boomers not getting off the stage: In 1997, the US president was born in 1946.
In 2007, the US president was born in 1946.
In 2017, the US president was born in 1946.
And next year in 2027? The US president will have been born in 1946.
The analogy is 💯
It's all the pain of raising the small child with none of the reward. Lil Claude only learns as far as his context window allows. After that he resets to the mean opinion of all the text on the internet. 😱
Parenting with an extra Sisyphean twist.
Sorry, Bluesky, but I have to say it: The little gremlin in my brain would not stop screaming about AI until I referenced Fran Fine and linked to "Me and You and Everyone We Know." erincikanek.com/the-rumors-o...
Yes!
Folks, please don't submit LLM-generated PRs to open source projects. It makes no sense.
If the maintainers want to use an LLM to fix an issue, they can use Claude or whatnot directly. They don't need you as an intermediary; that's just silly.
If they don't want to use LLMs, they have reasons.