INDEX
Explanations
references to traumatic events or accidents, particularly those involving injuries or fatalities
Negative Logits
ics           -0.16
892           -0.15
Fi            -0.15
Interr        -0.15
sse           -0.15
asters        -0.14
atori         -0.14
fh            -0.14
smith         -0.14
Generation    -0.14
Positive Logits
incident      0.19
igure         0.16
victim        0.16
rous          0.15
egra          0.15
usan          0.14
.yy           0.14
uintptr       0.14
#__           0.14
czy           0.14
Activations Density: 0.092%