Explanations
mentions of combat-related words and phrases
references to military combat
Negative Logits
gow        -0.78
ħĭ         -0.73
eus        -0.69
bye        -0.68
Else       -0.65
ocument    -0.65
ingen      -0.65
Seym       -0.64
ocl        -0.62
Choice     -0.61
Positive Logits
iveness    0.99
ants       0.88
fighting   0.87
ant        0.86
fatig      0.85
ments      0.84
ting       0.82
halla      0.76
prowess    0.75
fighters   0.75
Activations Density: 0.026%