    Negative Logits
    Token        Logit
    ' کی۔'       0.22
    ' فونبټ'     0.20
    ' tačiau'    0.20
    ' بالټ'      0.19
    '\}.'        0.19
    ' ).'        0.19
    ' ولكن'      0.19
                 0.18
    ' Tetapi'    0.18
    ' экран'     0.18
    Positive Logits
    Token        Logit
    ' ='         0.42
    '='          0.39
    ' is'        0.29
    ' was'       0.28
    '="'         0.26
    '=\'         0.25
    'return'     0.25
    '=('         0.25
    ' :='        0.25
    ' being'     0.25
    Activation Density: 0.370%

    No Known Activations