INDEX
    Explanations
    NEGATIVE LOGITS
    ker         0.87
    ff          0.83
    fiction     0.82
    ki          0.80
    ika         0.78
    jar         0.78
    ken         0.77
    ca          0.77
                0.77
    ky          0.76
    POSITIVE LOGITS
    ي           1.63
    يته         1.27
     fragment   1.20
    تين         1.20
    ل           1.20
    ر           1.20
    ל           1.16
    يلي         1.16
    ין          1.14
     fragments  1.11
    Act Density 0.009%

    No Known Activations