    NEGATIVE LOGITS
    Token      Value
    "ي"        1.84
    "ب"        1.41
    "т"        1.37
    "ت"        1.27
    "i"        1.19
    "ل"        1.16
    "ف"        1.15
    (blank)    1.15
    "ج"        1.13
    "كثر"      1.11
    POSITIVE LOGITS
    Token      Value
    " I"       1.56
    " King"    1.41
    " king"    1.02
    " the"     1.00
    " Queen"   0.98
    " KING"    0.95
    " for"     0.92
    "King"     0.92
    " by"      0.91
    " U"       0.90
    Activation Density: 0.080%

    No Known Activations