INDEX
Explanations
names of people, potentially associated with entertainment or news media
proper nouns, specifically names of individuals or characters
Negative Logits
aterasu          -0.67
_-               -0.66
;;;;;;;;;;;;     -0.59
Grayson          -0.58
wipes            -0.58
âĶĢâĶĢ           -0.57
PsyNetMessage    -0.55
---------        -0.53
defeats          -0.53
intersection     -0.52
Positive Logits
anski            0.78
cz               0.75
animous          0.71
éŃĶ              0.70
代               0.68
ansky            0.65
ovi              0.64
ski              0.64
ãĤ±              0.64
arella           0.63
Activations Density: 0.125%