INDEX
Explanations
references to individuals and their actions or characteristics, especially in the context of conflict or violence
Head Attr Weights
0:0.02
1:0.02
2:0.08
3:0.06
4:0.25
5:0.02
6:0.05
7:0.30
8:0.03
9:0.03
10:0.05
11:0.04
Negative Logits
imester    -1.94
iland      -1.83
ommel      -1.77
lege       -1.63
yssey      -1.62
legates    -1.59
answered   -1.54
akis       -1.49
Janeiro    -1.49
writers    -1.47
Positive Logits
tom               1.66
indiscrim         1.57
mortar            1.53
plaster           1.50
retali            1.45
sarc              1.44
cannon            1.43
cannons           1.43
disproportionate  1.43
Interstitial      1.41
Activations Density 0.003%