Explanations
terminology related to different forms of warfare, such as nuclear, air, chemical, biological, and information warfare
terms and phrases related to warfare, particularly nuclear and chemical contexts
Negative Logits
icles    -0.83
este     -0.81
abet     -0.76
icle     -0.71
acci     -0.68
Label    -0.67
ibli     -0.67
ocent    -0.67
aho      -0.65
ident    -0.65
Positive Logits
fare      0.96
rior      0.92
riors     0.87
Warfare   0.84
waged     0.76
eer       0.75
warfare   0.74
STEM      0.73
eers      0.73
posture   0.72
Activations Density: 0.043%