INDEX
    Negative Logits
    Token           Logit
    "."             -0.43
    " ("            -0.42
    ","             -0.39
    "!"             -0.39
    "either"        -0.33
    "Spoljašnje"    -0.33
    (blank)         -0.33
    " auf"          -0.33
    "↵↵↵↵"          -0.32
    "Either"        -0.32
    Positive Logits
    Token           Logit
    " Paglinawan"   1.01
    " itſelf"       0.88
    " myſelf"       0.85
    " ſeveral"      0.84
    " fevere"       0.84
    " tfsi"         0.84
    " faſt"         0.84
    " Theſe"        0.84
    " uſed"         0.84
    " Efq"          0.83
    Activation Density: 0.019%

    No Known Activations