Explanations
proper nouns, particularly names and places associated with individuals or entities
Negative Logits
Token    Logit
rene     -0.19
re       -0.19
res      -0.19
resp     -0.18
rie      -0.18
ri       -0.17
reg      -0.17
da       -0.17
war      -0.16
uffy     -0.16
Positive Logits
Token    Logit
anou     0.19
ignty    0.18
υνα      0.18
bral     0.17
amental  0.17
ults     0.16
ivers    0.16
υν       0.16
olas     0.15
lease    0.15
Activations Density 0.015%