    NEGATIVE LOGITS
    Token               Logit
    " folklore"         -0.44
    " Citizenship"      -0.42
    " civility"         -0.42
    " coverage"         -0.39
    "mapbox"            -0.38
    " Literacy"         -0.38
    " Coverage"         -0.38
    "Tay"               -0.38
    " Passport"         -0.38
    "verwijspagina"     -0.38
    POSITIVE LOGITS
    Token               Logit
    " crash"            0.63
    " Crash"            0.59
    "Crash"             0.56
    "Autoritní"         0.53
    " autorytatywna"    0.53
    " flight"           0.52
    "mathbf"            0.50
    "Tour"              0.50
    "crash"             0.49
    " otomatig"         0.49
    Activation Density: 0.055%

    No Known Activations