INDEX
Explanations
phrases related to train derailments
references to train derailments and related events
Negative Logits
Kinnikuman    -0.79
ternity       -0.78
isations      -0.73
archs         -0.70
ples          -0.70
isdom         -0.68
orneys        -0.68
estate        -0.68
hesda         -0.66
atar          -0.65
Positive Logits
derail        1.46
derailed      1.11
heimer        0.84
crew          0.81
wagon         0.80
Amtrak        0.78
Dickinson     0.74
ashtra        0.70
freight       0.69
rupture       0.69
Activations Density 0.029%
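
To make the density figure concrete, here is a minimal sketch. It assumes that "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function name `activation_density` and the sample data are hypothetical, chosen only for illustration. Under that assumption, 0.029% corresponds to roughly 29 active tokens per 100,000.

```python
def activation_density(activations):
    """Fraction of tokens with a nonzero activation.

    Assumed definition; the page does not state how density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: 29 active tokens out of 100,000 (not real data).
sample = [1.0] * 29 + [0.0] * 99_971
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would be consistent with a narrow feature that fires only on a specific topic, such as the derailment-related tokens listed under Positive Logits.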