INDEX
Explanations
words related to military conflict and warfare strategies
references to various types of warfare
Negative Logits
Token       Logit
este        -0.74
val         -0.72
prints      -0.70
────────    -0.70
urate       -0.70
Label       -0.69
count       -0.68
otto        -0.67
ergy        -0.67
thumbnails  -0.66
Positive Logits
Token       Logit
Warfare     1.04
warfare     1.01
fare        0.95
riors       0.82
hysteria    0.81
posture     0.78
waged       0.78
nesday      0.76
readiness   0.75
rior        0.73
Activations Density: 0.011%