    NEGATIVE LOGITS
    ن       1.46
            1.36
    م       1.31
    ،       1.29
    ні      1.23
    و       1.23
    ش       1.19
            1.12
    اج      1.09
    re      1.05
    POSITIVE LOGITS
    ing     1.53
    u       1.32
    us      1.27
    Survey  1.25
    cuando  1.24
    IN      1.23
    ACT     1.16
    ING     1.14
    cially  1.13
    A       1.13
    Activation Density: 0.003%

    No Known Activations