INDEX
    Explanations
    Negative Logits
     بسي  1.16
     as  1.05
     لإ  1.03
    东西  1.02
     ي  0.95
     čís  0.93
     يد  0.90
     slumped  0.89
     á  0.88
     يا  0.88
    Positive Logits
    at  1.97
    т  1.63
    er  1.49
    in  1.42
    or  1.39
    (  1.39
    t  1.34
    ot  1.32
    ات  1.29
    1.28
    Activation Density: 0.027%

    No Known Activations