INDEX
    Explanations

    proper names and titles

    Negative Logits
    " our"     -1.84
    " ("       -1.20
    " but"     -1.13
    " like"    -1.09
    " one"     -1.03
    " we"      -1.02
    " then"    -1.01
    "āju"      -0.97
    " any"     -0.94
    " that"    -0.94
    Positive Logits
    (token not rendered)   1.26
    "Válasz"               1.23
    " eventuell"           1.16
    (token not rendered)   1.15
    "Hogyan"               1.11
    "ѝ"                    1.11
    " hakkında"            1.08
    "Hozzá"                1.06
    "flä"                  1.05
    " Cannot"              1.05
    Activation Density: 0.001%

    No Known Activations