INDEX
    Explanations

    foreign language characters and numbers like 1, 2

    Negative Logits
    それから        -1.20
     antik          -1.10
    czych           -1.04
     about          -1.00
     Antik          -0.99
     maybe          -0.98
    そして          -0.98
     căn            -0.98
    もしか          -0.98
     kunjungan      -0.97
    Positive Logits
     ДЕ             1.02
     przód          1.00
     ПРИ            1.00
    }},\            0.99
                    0.98
     ktorá          0.97
    मेंट            0.97
    にします        0.95
     düşük          0.95
    เม              0.93
    Act Density 0.142%

    No Known Activations