INDEX
Explanations
references to car accidents and related incidents
Negative Logits
murderers    -0.15
Shooter      -0.14
murdering    -0.14
errupted     -0.14
aria         -0.14
erais        -0.13
murdered     -0.13
poz          -0.13
ournal       -0.13
Killing      -0.13
Positive Logits
accident     0.56
acc          0.51
accidents    0.50
Accident     0.45
Acc          0.44
Acc          0.43
_acc         0.42
.acc         0.39
crash        0.38
acc          0.37
Activations Density: 0.108%