INDEX
Explanations
phrases related to tragic events
references to political or social controversies
Negative Logits
ability     -0.73
neighb      -0.70
duty        -0.69
eligible    -0.66
lifes       -0.66
hatch       -0.64
undet       -0.64
dra         -0.64
xual        -0.64
Aval        -0.64
Positive Logits
Advertisement   1.35
Whether         1.18
Much            1.16
Some            1.16
Perhaps         1.16
Besides         1.14
But             1.14
Since           1.13
Nevertheless    1.13
Related         1.12
Activations Density: 0.755%