    NEGATIVE LOGITS
    Token     Logit
    "0"       1.76
    "i"       1.70
    "ي"       1.61
    " a"      1.53
    "a"       1.41
    "ك"       1.34
    "1"       1.30
    "it"      1.27
    "י"       1.24
    "P"       1.23
    POSITIVE LOGITS
    Token         Logit
    "۲"           0.94
                  0.90
    " ऐसा"        0.88
    " extraño"    0.88
    "ಬ್ಬಿಣ"        0.88
    " weird"      0.87
    "AY"          0.85
    " odd"        0.80
    "奇怪"        0.79
    " کھیلو"      0.78
    Activation Density: 0.017%

    No Known Activations