INDEX
Explanations
actions related to aggressive or forceful behavior, such as storming, invading, or raiding
terms related to aggressive actions or military activities
Negative Logits
rious         -0.79
inguishable   -0.75
ministic      -0.73
reg           -0.73
rag           -0.70
rentice       -0.70
onna          -0.69
reci          -0.68
pronounced    -0.67
ials          -0.67
Positive Logits
Britann       0.72
Europe        0.72
Myanmar       0.71
nests         0.71
Stamford      0.69
Bengal        0.68
Luxembourg    0.68
Sussex        0.67
Britain       0.66
unprotected   0.66
Activations Density: 0.148%