INDEX
Explanations
mentions of wars
references to wars
Negative Logits
Token       Logit
AUT         -0.77
FORMATION   -0.70
OGR         -0.67
ostics      -0.65
Safety      -0.64
Asset       -0.64
UCT         -0.63
YL          -0.62
osis        -0.61
SOURCE      -0.61
Positive Logits
Token       Logit
hips        1.11
wars        1.04
lords       1.00
bucks       0.96
hip         0.94
poons       0.91
riors       0.89
waged       0.82
raged       0.79
battles     0.79
Activations Density: 0.010%