INDEX
Explanations
instances of people's names and their associated actions or roles in violent contexts
Negative Logits
Hentet            -0.71
zheimer           -0.61
urlpatterns       -0.60
Pratique          -0.54
addCriterion      -0.53
ibrated           -0.53
OMITBAD           -0.52
<=",              -0.52
StoryboardSegue   -0.52
FormTagHelper     -0.52
Positive Logits
victim            0.68
被害              0.62
victims           0.60
unprotected       0.58
víctima           0.56
attacked          0.55
vítima            0.55
Ouch              0.55
Opfer             0.54
victim            0.54
Activations Density 0.395%