    Negative Logits
    Token          Logit
    (not shown)    1.28
    "۔"            1.27
    "ли"           1.14
    "لي"           1.09
    (not shown)    1.05
    "ми"           1.03
    (not shown)    1.03
    "ди"           1.01
    (not shown)    1.00
    " ذریع"        0.98
    Positive Logits
    Token          Logit
    " has"         1.45
    " was"         1.32
    "'"            1.19
    " to"          1.11
    " java"        1.02
    (not shown)    1.01
    " is"          0.99
    " P"           0.99
    (not shown)    0.99
    " V"           0.97
    Act Density 0.001%

    No Known Activations