    Explanations

    exceeded expectations, won

    Negative Logits
     usefulness        0.34
     동작              0.34
     chromatography    0.33
     Dressing          0.33
     నో                0.33
     cryptographic     0.33
     usability         0.32
     Watering          0.32
     playroom          0.32
     tortilla          0.32
    Positive Logits
     dominate          0.69
     dominates         0.58
     dominated         0.50
     dominating        0.49
     scored            0.49
     overcame          0.48
     narrowly          0.48
    拿下               0.48
     topped            0.47
     garnered          0.46
    Activation Density: 0.008%

    No Known Activations