INDEX
Explanations
words related to political figures and locations
names of individuals, particularly in the context of locations and titles
Negative Logits
+/-          -0.65
plays        -0.64
SPACE        -0.63
caps         -0.62
effic        -0.61
xp           -0.61
diligence    -0.60
sense        -0.58
=-=-=-=-     -0.57
culp         -0.57
Positive Logits
adra         0.75
west         0.74
elaide       0.72
iste         0.71
HR           0.68
ascus        0.68
endment      0.67
ikh          0.66
riors        0.66
Stores       0.66
Activations Density 0.348%