INDEX
    NEGATIVE LOGITS
    Token      Logit
    т          2.53
    [blank]    2.22
    к          2.09
    [blank]    2.09
    కు         2.03
     sebab     1.94
    [blank]    1.90
    至於       1.85
    보면       1.84
    десят      1.79
    POSITIVE LOGITS
    Token      Logit
    uje        2.17
    ет         2.09
    ský        2.05
    τή         1.88
    ské        1.87
    ských      1.84
    "          1.82
    ot         1.78
    [blank]    1.73
    (\         1.73
    Activation Density: 0.003%

    No Known Activations