INDEX
Explanations
Mentions of the name "George," particularly when it refers to historical or political figures
Negative Logits
ltk      -0.18
chner    -0.16
loha     -0.15
idth     -0.15
yny      -0.15
idad     -0.14
things   -0.14
dle      -0.14
thing    -0.14
nist     -0.13
Positive Logits
Orwell       0.28
Washington   0.26
Soros        0.25
HW           0.25
washington   0.21
Washington   0.21
anna         0.21
anne         0.21
Clo          0.21
Bush         0.21
Activations Density 0.015%