INDEX
    Explanations

    references to specific software tools or platforms

    Negative Logits
    Token       Logit
    izoph       -0.71
    outhern     -0.67
     mathemat   -0.63
    rera        -0.63
     tyr        -0.61
    arnaev      -0.61
     athlet     -0.61
    essage      -0.61
     gobl       -0.60
     exha       -0.60

    Positive Logits
    Token       Logit
    ski         1.01
    ger         0.98
    ging        0.93
    gers        0.88
    fold        0.87
    gin         0.87
    ations      0.85
    ated        0.85
    sky         0.84
    icated      0.83
    Act Density 0.027%

    No Known Activations