INDEX
Explanations
words related to military or aggressive actions
Negative Logits
SPONSORED    -0.67
eminent      -0.66
liest        -0.66
cond         -0.65
cul          -0.64
privately    -0.61
timeless     -0.60
interven     -0.60
wa           -0.59
AUD          -0.59
Positive Logits
akery        0.84
ooter        0.84
oks          0.84
aders        0.83
ooters       0.82
olid         0.82
oche         0.80
izard        0.79
angs         0.77
ords         0.76
Activations Density: 0.101%