INDEX
Explanations
descriptions of violent actions and confrontations involving multiple individuals
references to violent or aggressive incidents involving groups of people
Negative Logits
mberg            -0.74
ゴン             -0.72
Comprehensive    -0.71
ISTORY           -0.70
successes        -0.68
taxpayers        -0.68
immune           -0.65
waterways        -0.64
reinstated       -0.62
virtues          -0.62
Positive Logits
looking          0.94
resembling       0.83
approached       0.82
wielding         0.80
dressed          0.79
approaching      0.75
nikov            0.73
masked           0.73
unidentified     0.72
barg             0.72
Activations Density 0.436%