INDEX
    NEGATIVE LOGITS
    Token          Value
    "л"            1.39
    "ק"            1.34
    "ف"            1.30
    "지만"         1.23
    "IM"           1.21
    (not shown)    1.20
    "ER"           1.19
    "AND"          1.19
    "ح"            1.19
    (not shown)    1.16
    POSITIVE LOGITS
    Token          Value
    "in"           1.28
    "al"           1.16
    "right"        1.13
    " right"       1.13
    "จะ"           1.09
    "用の"         1.07
    "inę"          1.03
    "τα"           1.02
    "to"           1.01
    (not shown)    1.00
    Activation Density: 0.032%

    No Known Activations