INDEX
    Explanations
    Negative Logits
    Token      Logit
    (blank)    1.29
    (blank)    1.08
    " in"      1.06
    "و"        1.04
    (blank)    0.96
    "u"        0.94
    "köy"      0.91
    "ла"       0.88
    "ków"      0.87
    "gono"     0.87
    Positive Logits
    Token      Logit
    "н"        1.45
    (blank)    1.34
    "ن"        1.21
    (blank)    1.13
    "il"       1.12
    (blank)    1.12
    " for"     1.11
    (blank)    1.09
    " copi"    1.05
    "ز"        1.04
    Act Density 1.327%

    No Known Activations