The researchers found that seven of the top ten consumer devices and smartphone apps were reliable at tracking steps, and five showed high accuracy when tested in laboratory conditions.
I was somewhat surprised to learn that the less expensive trackers, like the $69.99 Misfit Shine or $59.95 Fitbit Zip, weren't any less reliable than their pricier counterparts.
"The activity trackers performed better than we expected in general," said Martijn de Groot, one of the authors of the paper. De Groot is also a research director at the Quantified Self Institute, which specializes in wearable technology. "Only the Nike Fuelband was disappointing," he told me.
The wearables market has exploded in the past few years. The research firm IDC predicts that the number of trackers shipped will jump from around 20 million in 2014 to over 120 million in 2019. Some devices emphasize style and affordability; others offer far more sophisticated data than just step counts, like sleep quality and heart rate.
[Editors' Note: I first met de Groot at the 'Quantified Self' conference in San Francisco, where he presented on a new program to teach nurses and other health professionals about statistics. More on that here.]
The Moves app for the iPhone and Nike Fuelband were the least accurate and reliable, according to the researchers, while the Jawbone UP, Misfit Shine, Withings Pulse, Fitbit Zip and Lumoback scored high on both counts. These results echo previous studies, which found that the Fitbit and Withings devices were strong performers.
The researchers kicked off the experiment a year ago and recruited more than 50 healthy adult volunteers. What stood out to me about the study is that it's among the first to test the fitness trackers both in the lab and in "free living conditions," meaning in the participants' daily lives. A device might perform well in lab conditions for the purposes of academic studies, but not in the real world.
For the lab portion of the research, the participants were asked to walk twice on a treadmill for 30 minutes while wearing all ten trackers and an ActivPAL, an accelerometer-based monitor worn on the thigh that is commonly used by researchers as the "gold standard."
The participants were subsequently asked to sport all of the devices during a regular workday, but to abstain from cycling, driving or any other activity that might skew the results by damaging the tracker or shifting the wearing position.
By performing statistical analysis to compare the results from the trackers and the gold standard, the researchers were able to assess the validity (how close a device's count is to the true count) and the reliability (how consistent its readings are from one measurement to the next) of the devices. According to de Groot, reliability is more important for most people. If the feedback is off, but off by about the same amount every day, then it's still useful for workout purposes.
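To make the difference between being accurate and being consistent concrete, here is a minimal sketch in Python. It is not the researchers' method, and the step counts are invented for illustration; the function name `bias_and_spread` is hypothetical. It compares each tracker's readings with a reference count and reports the average error (a rough stand-in for validity) and how much that error varies (a rough stand-in for reliability).

```python
from statistics import mean, stdev


def bias_and_spread(tracker_steps, reference_steps):
    """Return (mean difference, standard deviation of the differences).

    The mean difference shows how far off the tracker is on average.
    The standard deviation shows how much that error changes between
    measurements: a small value means the tracker is consistently off.
    """
    diffs = [t - r for t, r in zip(tracker_steps, reference_steps)]
    return mean(diffs), stdev(diffs)


# Hypothetical step counts, NOT data from the study.
reference = [3000, 3100, 2950, 3050]

# Always 300 steps too low: inaccurate, but perfectly consistent.
consistent_tracker = [2700, 2800, 2650, 2750]

# Sometimes too high, sometimes too low: close on average, but erratic.
erratic_tracker = [3300, 2800, 3150, 2750]

print("consistent:", bias_and_spread(consistent_tracker, reference))
print("erratic:   ", bias_and_spread(erratic_tracker, reference))
```

In this made-up example, the consistent tracker undercounts by 300 steps every time, so its error never varies. The erratic tracker looks better on average, but its error swings by hundreds of steps, which makes day-to-day comparisons much less trustworthy.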
For patients with serious medical conditions, de Groot said he would suggest a device that is worn on the pelvis or chest, rather than on the wrist.
Wrist-worn activity trackers are more prone to errors when we vigorously move our arms during the day, he said. But it also depends on what you're trying to measure: To track lower-limb activity like cycling, for instance, he would suggest a device worn on the ankle.
In an interview, de Groot shared some of the limitations of the study. The researchers did not test for long-distance tracking or other measurements, including sleep. Moreover, they carried out the experiments one year ago. The Moves app, which is the only one that relies on the iPhone's built-in accelerometer, has likely improved on the more recently released iPhone 6 and 6 Plus.
Which wearable tracker do you use? Do you use it regularly, and have you experienced any issues? Share your story with me at email@example.com