Explanations
phrases mentioning specific locations and military activities in current events
Negative Logits
ea            -0.77
abel          -0.73
atoon         -0.72
venture       -0.72
utan          -0.71
milo          -0.71
nell          -0.70
ettings       -0.70
emption       -0.69
isson         -0.69
Positive Logits
entire        1.18
hardest       1.05
unsuspecting  1.02
weakest       1.00
poorest       1.00
nation        0.99
slightest     0.94
same          0.94
Clintons      0.93
entirety      0.93
Activations Density: 0.296%