Error-riddled data sets are warping our sense of how good AI really is

**C C** · Apr 9, 2021 06:54 AM

https://www.technologyreview.com/2021/04...-progress/

EXCERPTS: The 10 most cited AI data sets are riddled with label errors, according to a new study out of MIT, and it’s distorting our understanding of the field’s progress.

Data backbone: Data sets are the backbone of AI research, but some are more critical than others. There are a core set of them that researchers use to evaluate machine-learning models as a way to track how AI capabilities are advancing over time. [...] In recent years, studies have found that these data sets can contain serious flaws.

[...] Now what? Northcutt encourages the AI field to create cleaner data sets for evaluating models and tracking the field’s progress. He also recommends that researchers improve their data hygiene when working with their own data. Otherwise, he says, “if you have a noisy data set and a bunch of models you’re trying out, and you’re going to deploy them in the real world,” you could end up selecting the wrong model. To this end, he open-sourced the code he used in his study for correcting label errors, which he says is already in use at a few major tech companies... (MORE - details)
