INDEX
Explanations
names of people or proper nouns
Negative Logits
Token          Logit
ntity          -0.15
mdi            -0.14
conflicts      -0.14
irth           -0.14
abez           -0.14
reten          -0.14
Bearer         -0.14
anela          -0.13
yx             -0.13
Publication    -0.13
Positive Logits
Token          Logit
çĦ¶            0.16
utto           0.15
fre            0.15
iteur          0.15
Raq            0.15
dumb           0.14
athan          0.14
separ          0.14
Mobil          0.14
Chatt          0.13
Activations Density: 0.007%