INDEX
Explanations
numeric data or statistical information in the text
Negative Logits
azzi     -0.18
azo      -0.17
azzo     -0.16
desar    -0.16
stad     -0.15
алом     -0.15
Nine     -0.14
lette    -0.14
Kumar    -0.14
zhou     -0.14
Positive Logits
6         0.27
sixth     0.25
Sixth     0.23
six       0.23
seventh   0.22
六        0.21
7         0.21
十六      0.21
-six      0.21
六        0.21
Activations Density 0.203%