INDEX
Explanations
phrases related to the main causes of specific negative outcomes, such as death or injuries
phrases that discuss causes and contributing factors related to various issues
Negative Logits
tti           -0.81
rocal         -0.79
š             -0.77
zzy           -0.72
vind          -0.69
nces          -0.69
Cosponsors    -0.69
acet          -0.68
poke          -0.66
tto           -0.66
Positive Logits
amen          0.74
ours          0.72
contention    0.70
humankind     0.66
imaginable    0.66
ever          0.65
ever          0.62
mankind       0.62
endeavor      0.61
earch         0.61
Activations Density 0.234%