    Negative Logits
    ي    1.42
    י    1.33
    ו    1.05
    (blank)    1.05
    i    0.99
    ب    0.96
    ف    0.96
    ك    0.96
    (blank)    0.96
    (blank)    0.93
    Positive Logits
    (blank)    0.83
     Fairy    0.76
    wd    0.70
    ্ড    0.70
    uk    0.64
    RI    0.64
     to    0.64
     fairy    0.64
    BY    0.63
    MAN    0.63
    Activation Density 0.001%

    No Known Activations