INDEX
    Negative Logits
        "F"         0.68
        "おしゃれ"  0.62
        " Ove"      0.61
        "K"         0.61
        " Espí"     0.59
        "O"         0.58
        " Bình"     0.57
        " Ál"       0.57
        " Ə"        0.57
        " Alek"     0.57
    Positive Logits
        "i"         1.30
        "ي"         1.14
        "ت"         1.02
        "י"         1.00
        "ed"        0.96
        "و"         0.93
        "n"         0.91
        "d"         0.87
        "ul"        0.84
        "od"        0.78
    Act Density 3.662%

    No Known Activations