INDEX
    Explanations

    words following emphasized words

    Negative Logits
    𝗗    1.13
    ؘ    1.13
    ছে    1.05
    𒅗    1.05
        1.03
     وعلى    1.01
    piece    0.98
    𝗙    0.98
    ية    0.98
     forerunner    0.96
    Positive Logits
    ש    1.19
    ed    1.18
    '    1.16
        1.13
    ena    1.06
    1    1.06
    !,    1.05
     bzw    1.05
     niej    1.05
    ות    1.04
    Activation Density: 0.090%

    No Known Activations