INDEX
Explanations
words related to historical events and locations, especially related to wars and political movements
specific names or terms related to political entities or groups
Negative Logits
timet          -0.63
margins        -0.62
backgrounds    -0.60
gif            -0.59
Attribution    -0.57
trimmed        -0.57
\<             -0.56
ãĤ¦            -0.55
barr           -0.54
traged         -0.53
Positive Logits
geon           0.89
hyde           0.85
alion          0.79
bag            0.79
step           0.75
osaurs         0.74
Restaur        0.74
geons          0.73
othy           0.73
eteria         0.72
Activations Density: 0.126%