Explanations
instances of words related to physical injuries
mentions of injuries
Negative Logits
zee           -0.77
zees          -0.76
avanaugh      -0.71
ullivan       -0.71
rows          -0.71
ILE           -0.71
@@@@          -0.69
itudes        -0.66
ģĸ            -0.66
ihu           -0.66
Positive Logits
prevention    0.92
inflicted     0.91
injury        0.89
incurred      0.84
prone         0.82
replacements  0.82
injuries      0.80
suffered      0.79
riddled       0.79
plagued       0.78
Activations Density 0.027%