INDEX
Explanations
names or parts of names related to specific individuals or entities
words relating to political figures and their actions
Negative Logits
nesday      -0.81
glers       -0.75
ancial      -0.71
rongh       -0.71
erential    -0.68
auga        -0.66
ankind      -0.65
awed        -0.64
inctions    -0.63
invoke      -0.63
Positive Logits
ondo        0.67
tes         0.65
itect       0.64
Odyssey     0.64
inian       0.63
Cortex      0.62
ansas       0.61
ICT         0.60
Lange       0.60
Awakening   0.60
Activations Density 0.084%