INDEX
    Explanations

    foreign negative/question particles

    Negative Logits
    p     1.20
    i     1.09
    re    1.08
    m     1.08
    ro    1.07
    ى     1.02
    a     0.99
    n     0.98
    یل    0.97
    "     0.97
    Positive Logits
    𝟎     1.13
    ۰     1.05
    কে    1.04
     on   1.01
    0     0.99
    _{    0.95
    (token not shown)    0.93
     an   0.93
    িত    0.91
    (token not shown)    0.91
    Act Density 0.007%

    No Known Activations