Rankings

by curiosityiii

Did anyone else notice that as the pass number goes up and there are more classifications, the rankings are getting more difficult?

Posted February 23, 2015 2:49 AM

by Quia

I only came back to the project last month, so I didn't see the kinds of pairings being presented in passes 1-3 and can't confirm. It's quite possible that you're right, though! It would be super interesting to hear a followup from @parrish on what's actually being used on this much larger set of data. Here's a quote from the discussion linked to earlier:

In terms of optimizing for the least number of required classifications, that's something we're still working on.

This system keeps track of scores in real time, but the parameters were only optimized against the smaller subset of data from the project beta.

After we have a pretty good volume of classifications on this dataset, I'll go back and look for ways to gain the most information per click.

Some of the optimizations we're testing:

Initial score seeds. We have a fair amount of information about each image. Starting the images with a "best-guess" score can cause them to require fewer classifications to reach a stable score.

Early "retirement." We may be able to stop comparing images that have stable scores. That's a bit tricky, though, as removing images changes the population, which in turn changes the relative score distribution.

User weighting. Not something we often talk about, but some volunteers are better at classifying certain types of data. For instance, if we can spot that a person is really accurate when comparing two images close to the limb, then we can add a weight to their classification. Likewise, if your cat starts playing with your mouse, we can fix that too 😃

Nonrandom selection. I tested a few methods in simulations without much luck, but it's worth revisiting with real data from the project. Potentially, focusing more classifications on images with high standard deviations will be worthwhile.

As Paul mentioned, the next dataset we're going to be ranking will be much larger, so we'll need to optimize wherever we can.

http://talk.sunspotter.org/#/boards/BSZ0000001/discussions/DSZ000000b
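
To make the ideas in that quote a bit more concrete, here is a rough sketch of how seeding, user weighting, early retirement and nonrandom selection could fit together. This is only an illustration: the quote doesn't say which scoring system the project uses, so the Elo-style update here is an assumption, and every name and number in it (`K`, `window`, `retire_below`, the 400-point scale) is hypothetical.

```python
import random
import statistics

K = 32.0  # assumed step size; not a value taken from the project


def expected(score_a, score_b):
    """Elo-style probability that image A is ranked above image B."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


def update(scores, history, winner, loser, user_weight=1.0):
    """Apply one classification; user_weight scales how far the scores move."""
    delta = K * user_weight * (1.0 - expected(scores[winner], scores[loser]))
    scores[winner] += delta
    scores[loser] -= delta
    history[winner].append(scores[winner])
    history[loser].append(scores[loser])


def spread(history, image, window=5):
    """Standard deviation of an image's most recent scores (inf if too few)."""
    recent = history[image][-window:]
    return statistics.pstdev(recent) if len(recent) > 1 else float("inf")


def pick_pair(history, rng, retire_below=1.0):
    """Show the least settled image next; skip images whose scores look stable."""
    active = [i for i in history if spread(history, i) >= retire_below]
    if len(active) < 2:
        return None  # everything has been "retired"
    first = max(active, key=lambda i: spread(history, i))
    second = rng.choice([i for i in active if i != first])
    return first, second


# Initial score seeds: start each image from a best guess instead of a flat value.
seeds = {"img_a": 1550.0, "img_b": 1500.0, "img_c": 1450.0}
scores = dict(seeds)
history = {image: [score] for image, score in seeds.items()}

rng = random.Random(0)
pair = pick_pair(history, rng)
if pair is not None:
    # Pretend the volunteer picked the first image, and that we trust them a bit less.
    update(scores, history, winner=pair[0], loser=pair[1], user_weight=0.5)
```

Note that this sketch runs straight into the catch mentioned in the quote: once images drop out of `active`, the remaining ones are only compared with each other, so the score distribution shifts.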

Posted February 24, 2015 10:26 AM

by Panasko

I am wondering how many rankings need to be done for the project to be complete, or at least this phase? Is there a set number, or is it more like "Oh... I guess we have enough classifications"? 😉

Posted April 26, 2015 11:23 AM

by Quia

The first, small dataset that @parrish is talking about in the post I quoted completed 50 passes. I don't think we're aiming that high this time, but I don't know which of the optimizations he mentioned we're using now.

Posted May 1, 2015 6:48 PM

by Panasko

What I meant is: how many classifications are required in total? Do we have an estimate or an exact figure?

Posted May 16, 2015 6:10 PM