INDEX
    Explanations

    numbers preceding words

    Negative Logits
    ش          2.86
               2.36
    თვის       2.09
               2.08
     sveta     2.06
               2.03
    ra         1.98
    dehyde     1.96
    ্লাহ       1.95
    sz         1.94
    Positive Logits
    ϲ              2.28
    fifths         2.03
    жды            1.67
     ascertaining  1.63
    ão             1.61
                   1.60
                   1.59
    T              1.58
    л              1.58
    ing            1.54
    Act Density 0.171%

    No Known Activations