Sunspotter Talk

Rankings

  • curiosityiii by curiosityiii

    Did anyone else notice that as the pass number increases and more classifications come in, the rankings are getting more difficult?

  • Quia by Quia

    I only came back to the project last month, so I didn't see the kinds of pairings presented in passes 1-3 and can't confirm. It's quite possible that you're right, though! Here's a quote from the discussion linked to earlier... It would be super interesting to hear a follow-up from @parrish on what's actually being used on this much larger set of data.

    In terms of optimizing for the least number of required classifications, that's something we're still working on.

    This system keeps track of scores in real time, but the parameters were only optimized against the smaller subset of data from the project beta.
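
    For illustration only, here is a minimal sketch of what real-time score tracking from pairwise clicks could look like. It assumes an Elo-style update, which this thread does not confirm is what the project uses. The names expected_win, update_scores, k, scale, and user_weight are all hypothetical.

```python
def expected_win(score_a, score_b, scale=400.0):
    """Estimated probability that image A is picked over image B.

    Uses the standard Elo logistic curve; `scale` is an assumed constant.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


def update_scores(score_a, score_b, a_chosen, k=32.0, user_weight=1.0):
    """Return new (score_a, score_b) after a single classification.

    k           -- assumed step size for one click
    user_weight -- hypothetical per-volunteer multiplier (1.0 = no weighting)
    """
    expected_a = expected_win(score_a, score_b)
    outcome_a = 1.0 if a_chosen else 0.0
    # A surprising result (low expected, but chosen) moves the scores further.
    delta = k * user_weight * (outcome_a - expected_a)
    # Whatever A gains, B loses, so the total score in the pool is conserved.
    return score_a + delta, score_b - delta
```

    With these assumed defaults, two images that both start at 1400 end up at 1416 and 1384 after one click in favour of the first.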

    After we have a pretty good volume of classifications on this dataset, I'll go back and look for ways to gain the most information per click.

    Some of the optimizations we're testing:

    • Initial score seeds. We have a fair amount of information about each image. Starting the images with a "best-guess" score can cause them to require fewer classifications to reach a stable score.
    • Early "retirement." We may be able to stop comparing images that have stable scores. That's a bit tricky, though, as removing images changes the population, which in turn will cause the relative score distribution to change.
    • User weighting. Not something we often talk about, but some volunteers are better at classifying certain types of data. For instance, if we can spot that a person is really accurate when comparing two images close to the limb, then we can add a weight to their classification. Likewise, if your cat starts playing with your mouse, we can fix that too 😃
    • Nonrandom selection. I tested a few methods in simulations without much luck, but it's worth revisiting with real data from the project. Potentially, focusing more classifications on images with high standard deviations will be worthwhile.
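
    To make two of the ideas above concrete, here is a rough sketch of early retirement combined with nonrandom selection that favours images with a high standard deviation. It is not the project's code. The thresholds, the score-history layout, and the names score_spread, is_retired, and pick_pair are all assumptions made for the example. Each history could begin with a best-guess seed score.

```python
import random
import statistics


def score_spread(history):
    """Standard deviation of a list of scores (0.0 if fewer than two)."""
    return statistics.pstdev(history) if len(history) >= 2 else 0.0


def is_retired(history, min_classifications=10, stable_spread=5.0, window=5):
    """Hypothetical retirement rule: enough data and a quiet recent window."""
    if len(history) < min_classifications:
        return False
    return score_spread(history[-window:]) < stable_spread


def pick_pair(histories, rng=random):
    """Pick two distinct active images, favouring those with unstable scores.

    histories -- dict mapping image id to the list of scores it has had
    Returns None when fewer than two images are still active.
    """
    active = [img for img, h in histories.items() if not is_retired(h)]
    if len(active) < 2:
        return None
    # The +1.0 keeps brand-new images (spread 0) eligible for selection.
    weights = [1.0 + score_spread(histories[img]) for img in active]
    first = rng.choices(active, weights=weights, k=1)[0]
    rest = [img for img in active if img != first]
    rest_weights = [1.0 + score_spread(histories[img]) for img in rest]
    second = rng.choices(rest, weights=rest_weights, k=1)[0]
    return first, second
```

    This sketch ignores the problem mentioned above: once images are retired, the remaining population changes, so the relative score distribution would shift as well.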

    As Paul mentioned, the next dataset we're going to be ranking will be much larger, so we'll need to optimize wherever we can.

    http://talk.sunspotter.org/#/boards/BSZ0000001/discussions/DSZ000000b

  • Panasko by Panasko

    I am wondering how many rankings need to be done for the project to complete, or at least this phase. Is there a set number, or is it more like "Oh... I guess we have enough classifications"?
    😉

  • Quia by Quia

    The first, smaller dataset that @parrish is talking about in the post I quoted completed 50 passes. I don't think we're aiming that high this time, but I don't know which of the optimizations he mentioned are in use now.

  • Panasko by Panasko

    What I meant is: how many classifications are required in total? Do we have an estimate or an exact figure?
