INDEX
Explanations
words related to physical trauma and health conditions
references to trauma and its effects
Negative Logits
bye        -0.94
arius      -0.67
verages    -0.67
ramer      -0.66
ateur      -0.65
ovation    -0.65
ebus       -0.65
tarians    -0.65
endar      -0.65
holder     -0.64
Positive Logits
trauma      1.05
inflicted   1.02
umatic      0.94
traumatic   0.93
traumat     0.90
traumatic   0.89
wounds      0.86
flashbacks  0.83
imaru       0.83
incurred    0.82
Activations Density: 0.029%