Explanations
names of individuals, potentially authors or public figures
references to specific individuals or notable figures
Negative Logits
vous      -0.75
ez        -0.73
reflex    -0.71
otle      -0.70
vez       -0.67
crew      -0.61
due       -0.60
orius     -0.59
ulic      -0.59
Lab       -0.58
Positive Logits
owship     1.20
atio       0.86
uates      0.76
Beckham    0.74
iquette    0.73
vu         0.72
berra      0.72
ername     0.71
insk       0.70
ibur       0.70
Activations Density 0.054%