Explanations
Mi, Ni, Bu, Nu, ru prefixes
Negative Logits
апреле    0.41
淤        0.41
ÍC        0.39
혀        0.39
онов      0.39
ಯ         0.39
othiaz    0.38
க்        0.38
⬯         0.38
అవ        0.37
Positive Logits
pper      0.77
pping     0.71
pped      0.64
ppy       0.63
ffi       0.63
ff        0.63
xt        0.61
ppers     0.60
ppling    0.59
ppo       0.58
Activations Density: 0.039%