    Explanations

    colons followed by numbers

    Negative Logits
    Token          Logit
    ":"            -0.30
    (not shown)    -0.22
    ":↵"           -0.20
    "ãģ¾ãģŁ"       -0.19
    "{}"           -0.17
    ":↵↵"          -0.17
    "("            -0.16
    " latter"      -0.16
    "{}\"          -0.15
    "():"          -0.14
    Positive Logits
    Token            Logit
    "innen"          0.20
    "istrovstvÃŃ"    0.19
    "iban"           0.15
    "ilir"           0.15
    "kv"             0.15
    "stav"           0.14
    "wik"            0.14
    "بÙĪØ§Ø³Ø·Ø©"     0.14
    "reffen"         0.13
    "á»Ĩ"            0.13
    Act Density 0.549%

    No Known Activations