INDEX
    Explanations

    fit, unfit, or medical status

    Negative Logits
    Token               Logit
    " ദുൽ"              0.41
    (no token shown)    0.39
    "ónico"             0.38
    " एडज"              0.38
    " Natalie"          0.38
    " inseparable"      0.37
    "Natalie"           0.37
    (no token shown)    0.37
    " clique"           0.36
    "Normal"            0.36
    Positive Logits
    Token               Logit
    " fitness"          1.42
    " fit"              1.39
    "Fitness"           1.38
    "fitness"           1.36
    " unfit"            1.34
    " Fitness"          1.32
    " Fit"              1.26
    "Fit"               1.23
    " FITNESS"          1.19
    "fit"               1.16
    Activation Density: 0.012%

    No Known Activations