#dataQuality

Latest posts tagged with #dataQuality on Bluesky

Posts tagged #dataQuality

Biggest source of dirty data? Not old records. New ones from forms with no validation.

"Google" / "Google LLC" / "google" = three records, one company.

Use dropdowns and input masks. Prevent the mess.

#DataQuality #FormDesign
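
For records that have already been entered, one possible cleanup step is to collapse spelling variants onto a shared key. The sketch below is only an illustration: the `company_key` helper and the `LEGAL_SUFFIXES` list are hypothetical names, and the suffix list is an assumption that would need extending for real data.

```python
import re

# Illustrative, non-exhaustive list of legal suffixes (an assumption).
LEGAL_SUFFIXES = {"llc", "inc", "ltd", "corp", "co"}


def company_key(name: str) -> str:
    """Return a normalized key so spelling variants of one company collide."""
    # Lowercase, then replace punctuation with spaces before splitting into tokens.
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    # Drop trailing legal suffixes such as "LLC" or "Inc".
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


variants = ["Google", "Google LLC", "google"]
keys = {company_key(v) for v in variants}  # all three variants share one key
```

A key like this helps with merging after the fact; validation on the form itself is still what prevents the variants from being created.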


If 95% of organizations are confident in their #AI #document pipelines, why do more than half of those same organizations report frequent quality failures?

Get the full breakdown in our AI Readiness 2025 Addendum: bit.ly/4sBYSmT

#AI #DocumentProcessing #DataQuality

Artificial Intelligence and the Data Quality Problem No One Can Ignore - MedCity News
Healthcare is pouring money into AI, but poor data quality is quietly sabotaging results by scaling bias, errors, and mistrust instead of value. Until organizations fix historical data, set accuracy b...

Success for #artificialintelligence in #healthcare is still seen to hinge on #DataQuality. Until data gets cleaner and more consistent, concerns will linger. medcitynews.com/2026/03/arti...


Understanding where AI gets its knowledge, how it learns and what biases may be embedded within it is essential for creating ethical, effective and trustworthy AI. taxodiary.com?p=57891 #ExplainableAI #DataQuality #EthicalAI

County assessor pauses annual reassessments, sets 2026 level of assessment at 95% and outlines Tyler conversion
Tompkins County's assessor told the committee the office will pause annual reassessments for now, set the 2026 assessment level at 95% of market value, and move to Tyler Technologies' enterprise assessment system with an anticipated July 1, 2027 go-live.

The Tompkins County assessor has hit pause on annual reassessments to tackle a growing backlog while gearing up for a major database upgrade—what does this mean for your property taxes?

Learn more here

#TompkinsCounty #NY #CitizenPortal #DataQuality #PublicAccess #TompkinsCountyAssessments


Stages of data maturity:

Beginner: "We have data"
Intermediate: "We have a lot of data"
Advanced: "We have clean data"
Master: "We have governed, unified, analysis-ready data"

Most companies are stuck between Beginner and Intermediate.

#DataQuality #DataStrategy


Data is only as valuable as its quality. Organizations investing in analytics, AI and decision intelligence need platforms that ensure data is accurate and trustworthy from the start. #DataQuality #DataGovernance


As organizations generate more information than ever, gaps are making it harder to trust the data behind decisions. Strong taxonomies and the right technology partners help turn raw information into something reliable. #DataQuality #DataGovernance

Data advisory commission reviews revised research agenda and nine draft recommendations for legislative report
The commission reviewed an updated research agenda focused on workforce questions and cross-cutting analysis, discussed data gaps (retention, compensation, family childcare), and heard a proposed report structure and nine recommendations that staff will flesh out before a draft is circulated ahead of April.

The EEC commission is rethinking its research agenda to tackle urgent workforce questions, but what themes are emerging as priorities?

Click to read more!

#MA #CitizenPortal #ChildcarePolicy #DataQuality #WorkforceDevelopment


🧠 How context rot hurts enterprise AI and LLM results

Data is key to success, but its context degrades, which hurts AI.

https://thenewstack.io/context-rot-enterprise-ai-llms/

#LLM #DataQuality #AI #RoxsRoss


The DataOps Way to Data Quality: A Free Book for Every Data Team
Most data quality advice tells you what to measure. This book explains why your team keeps failing and what to do about it.
datakitchen.io/the-dataops-...
#databs #dataquality #dataops #free #opensource


Why Your Data Quality Dashboard Isn’t Working And What to Do About It
We reveal six powerful, and perhaps surprising, truths about data quality dashboard failure.
datakitchen.io/why-your-dat...
#databs #dataquality #opensource


"California" / "CA" / "ca" / "Calif."

Same place. Four CRM records. Your regional segment just missed half its audience.

Standardize to 2-letter codes. Make it a dropdown on forms.

#DataQuality #CRM
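
As a rough sketch of the standardization step for values that are already in the CRM, a lookup table can map free-text entries to 2-letter codes. The `STATE_ALIASES` table and the `normalize_state` helper are hypothetical names, and the table covers only the variants from this post; a real one would need every state and its known aliases.

```python
from typing import Optional

# Partial alias table for illustration only (an assumption, not a complete list).
STATE_ALIASES = {
    "california": "CA",
    "calif": "CA",
    "ca": "CA",
}


def normalize_state(raw: str) -> Optional[str]:
    """Map a free-text state value to a 2-letter code, or None if unrecognized."""
    # Trim whitespace, lowercase, and drop a trailing period ("Calif." -> "calif").
    cleaned = raw.strip().lower().rstrip(".")
    return STATE_ALIASES.get(cleaned)


values = ["California", "CA", "ca", "Calif."]
codes = [normalize_state(v) for v in values]  # every variant maps to "CA"
```

Returning `None` for unknown values, instead of guessing, leaves unrecognized entries visible for manual review.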

Original post on mastodon.social

I built a tool to find problems hiding in my training data.

LabelLens analyzes labeled text classification datasets for duplicates, mislabels, and class imbalance. Ran it on my own 26K sample dataset — found 5,664 exact duplicates I had no idea about.

Try it […]
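
The post does not describe how LabelLens works internally, so the following is only a generic sketch of how exact duplicates and label conflicts might be found in a labeled text dataset. The function names and the sample rows are made up for illustration, and treating texts as equal after case and whitespace normalization is an assumption.

```python
from collections import defaultdict


def find_exact_duplicates(samples):
    """Group (text, label) samples by normalized text.

    Returns {normalized_text: [indices]} for texts that occur more than once.
    """
    groups = defaultdict(list)
    for i, (text, _label) in enumerate(samples):
        # Normalize case and collapse runs of whitespace before comparing.
        key = " ".join(text.lower().split())
        groups[key].append(i)
    return {k: idx for k, idx in groups.items() if len(idx) > 1}


def conflicting_labels(samples, duplicates):
    """Return normalized texts whose duplicate copies disagree on the label."""
    return [
        k for k, idx in duplicates.items()
        if len({samples[i][1] for i in idx}) > 1
    ]


# Made-up sample rows for illustration.
samples = [
    ("great product", "pos"),
    ("Great  product", "pos"),
    ("terrible service", "neg"),
    ("terrible service", "pos"),
    ("okay", "neu"),
]
dups = find_exact_duplicates(samples)
conflicts = conflicting_labels(samples, dups)
```

Duplicates that disagree on the label are a cheap first signal for possible mislabels.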


Don't build a data lake.

Build a clean data lake. 🧹

Data teams spend 60-80% of their time cleaning instead of analyzing.

That's not a productivity issue. That's a structural failure.

#DataQuality #Analytics


We asked Reddit's r/dataengineering community 'Why don't you test?' and we got roasted. Learn why.
datakitchen.io/we-got-roast...
#databs #dataquality #dataops

#dataquality #dataengineering #decisionmaking #datagovernance #riskmanagement | Gabriel 🐱 Chandesris
"The hidden cost of poor-quality data (and how to avoid it)" Did you know that a single bad data point can cost your company thousands of euros? Here's how: Bad decisions: ...

Poor-quality data = bad decisions, wasted time, reputation at risk. Solutions: validation, documentation, training. For decision-makers: ask your tech team!

#DataQuality #DataEngineering #DecisionMaking #DataGovernance #RiskManagement

www.linkedin.com/posts/gabrie...


When complexity slows decisions, it’s often a data problem.

Our latest blog l1nq.com/RceWk explores why organizations are adopting DataOps to strengthen data pipelines, improve quality, and embed governance in an AI-driven world.

#DataOps #DataQuality #DataGovernance


#MeettheExperts #DataQuality #TESD

Join us for the final session of our KODAQS Toolbox talks!

Leon Fröhling will present #TESD, a tool for documenting and critically reflecting on online platform datasets.

🗓️ Thursday, 5 March, 2026, 1-2 pm.
📌 Online, register here: www.gesis.org/angebot/wiss...

Data quality = the invisible foundations of the business. 3 pillars: real-time validation, traceability, documentation. HR: ask candidates how they ensure this quality. #DataQuality #DataEngineering #Workflows #DataEngineer #Travail

https://www.linkedin.com/posts/gabriel-chandesris_dataquality-dataengineering-dataengineer-ugcPost-7434551715829579776-1ZqD

We wanted to understand the impact of outliers in high-frequency river water quality #monitoring data & find better ways to ensure #dataquality.

Using a 4-year dataset, we evaluated their quant. impact on summary stats & compared diff. detection methods, incl. uni- & multivariate approaches.

2/6
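
The thread does not say which detection methods were compared, so this is only a generic example of one common univariate approach: flagging points by their modified z-score, which is based on the median absolute deviation (MAD). The readings are invented, and the 3.5 threshold is a conventional default, not a value from the study.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (MAD-based) exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median; the score is undefined, so flag nothing.
        return []
    # 0.6745 rescales the MAD to be comparable to a standard deviation
    # for normally distributed data.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Invented sensor readings with one obvious spike.
readings = [7.1, 7.0, 7.2, 7.1, 12.9, 7.0, 7.2]
flagged = mad_outliers(readings)

# Compare a summary statistic with and without the flagged points.
mean_all = statistics.mean(readings)
mean_clean = statistics.mean(
    v for i, v in enumerate(readings) if i not in flagged
)
```

Median-based scores are less distorted by the outliers themselves than mean and standard deviation, which is why they are a frequent starting point for sensor data.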


NCPW 2026 runs March 1–7. Use it as a quick fraud readiness check for your business.

Read the playbook: www.searchbug.com/info/natio...

#NCPW2026 #FraudPrevention #ScamAwareness #Cybersecurity #IdentityTheft #DataQuality #RiskManagement #ConsumerProtection


Databricks just showed that clean, deduped data beats fancy model tweaks for faster LLMs. Think your GPU time could be saved with better pipelines? Dive into the findings and rethink your training strategy. #DataQuality #LLMTraining #Databricks

🔗 aidailypost.com/news/databri...

How to Craft a Comprehensive Data Cleanliness Policy
In today's data-driven landscape, maintaining data cleanliness is vital for accurate decision-making and organizational compliance.

📊✨ Ready to elevate your data game? Learn how to craft a comprehensive data cleanliness policy! Check out our latest blog post for expert tips! 👉 innovirtuoso.com/data-management/how-to-c... #DataManagement #CleanData #DataQuality


AI doesn't understand meaning the way we do. It retrieves what structure allows it to find. If meaning is not preserved before AI touches content, it cannot be recovered later. Data quality and expert design matter more than ever. #DataQuality #AIGovernance #KnowledgeManagement

Evaluation of the Accuracy of Probabilistic Record Linkage Across Sociodemographic Categories in 4 Databases: Exploratory Study

Background: Accurate patient record linkage is essential for clinical care, health information exchange, research, and public health surveillance. However, linkage accuracy may vary across demographic groups due to differences in data completeness, quality, and the structural factors underlying how demographic information is captured.

Objective: This study aimed to explore whether probabilistic patient matching accuracy varies by age, sex, race, and ethnicity and to identify potential sources of bias that may influence matching performance.

Methods: We used 4 Indiana data sources (the Indiana Network for Patient Care, Newborn Screening, Social Security Administration Death Master File, and Marion County Public Health Department) and applied a modified Fellegi-Sunter probabilistic linkage algorithm accommodating missing data under a missing at random assumption. Gold standard match status was established through dual manual review with adjudication. For each dataset, matching sensitivity, positive predictive value, and F1-scores were estimated and stratified by age, sex, race, and ethnicity. Data completeness, distinct value ratio, and Shannon entropy were assessed to characterize data quality. Ninety-five percent bootstrap CIs were used to assess significance.

Results: The algorithm-matching F1-score was greater than 0.82 for all age strata, ranging from 0.88 to 0.97 for sex, 0.85 to 0.99 for race, and 0.88 to 0.99 for ethnicity. Sensitivity ranged from 0.70 to 0.97 across age strata, 0.76 to 0.97 across sex, 0.85 to 0.99 across race, and 0.85 to 0.989 across ethnicity. Lower sensitivity and F1-scores were consistently observed in strata with greater missingness or discordance, particularly in Newborn Screening and Social Security Administration Death Master File. Race and ethnicity exhibited the highest missingness and lowest informational diversity, coinciding with the largest declines in accuracy. Shannon entropy and distinct value ratio varied across demographic groups and were strongly associated with performance, indicating that both low and excessively high informational diversity can impair matching.

Conclusions: Probabilistic patient matching accuracy is not uniform across demographics and is strongly influenced by data quality and completeness. Although overall matching performance, as assessed by the F1-score, remained above 0.8, it varied across datasets when stratified by sociodemographic characteristics. Sociodemographic data missingness is associated with lower matching accuracy, raising equity and ethical concerns for clinical, research, and public health applications. Routine demographic-stratified evaluations of matching accuracy, improved standardization of sociodemographic data, and fairness-aware linkage methods are essential to prevent the amplification of structural inequities in linked health datasets.

JMIR Formative Res: Evaluation of the Accuracy of Probabilistic Record Linkage Across Sociodemographic Categories in 4 Databases: Exploratory Study #PatientSafety #HealthData #PublicHealth #DataQuality #RecordLinkage
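
The abstract names three field-level quality measures (data completeness, distinct value ratio, and Shannon entropy) without giving formulas. The sketch below uses common textbook definitions, which are assumptions and may differ from the study's exact definitions: completeness as the share of non-missing values, distinct value ratio as distinct values divided by non-missing values, and entropy in bits. The `field_quality` helper and the sample column are hypothetical.

```python
import math
from collections import Counter


def field_quality(values):
    """Compute simple quality metrics for one column of categorical values.

    None and empty strings are treated as missing (an assumption).
    """
    present = [v for v in values if v is not None and v != ""]
    n = len(present)
    counts = Counter(present)
    completeness = n / len(values) if values else 0.0
    distinct_ratio = len(counts) / n if n else 0.0
    # Shannon entropy in bits: H = -sum(p * log2(p)) over observed values.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    return {
        "completeness": completeness,
        "distinct_value_ratio": distinct_ratio,
        "shannon_entropy": entropy,
    }


# Made-up column: four recorded values, two missing.
column = ["A", "A", "B", "B", None, ""]
metrics = field_quality(column)
```

Computing these per demographic stratum, and not only over the whole table, is what would reveal the kind of uneven missingness the abstract describes.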

Building a Data Quality Culture Without Becoming the Data Police
You can't guilt people into better data entry. Learn how to build a data quality culture through visibility, smart incentives, and automation.

The "data police" approach to data quality is a trap.

You assign someone to catch mistakes, they become the bad guy, nobody listens, the backlog grows, standards slip anyway.

Culture beats enforcement. Every time.

#DataQuality #RevOps #MarketingOps


Wearables are incredibly powerful tools for studying human behavior and wellbeing. But only if we’re clear about what questions they’re actually good at answering!

#Wearables #DigitalHealth #AppleWatch #Measurement #Psychometrics #Wellbeing #HealthResearch #IntensiveLongitudinalData #DataQuality


Great blog from the LexisNexis InterAction+™ Data Quality Services team: "Data Quality: The Unsung Hero to Your Law Firm’s CRM Success". Read more now! https://bit.ly/3Zak9Yv

#DataQuality #LegalTech
