Explanations
predictions or estimations about future outcomes
Negative Logits
cession    -0.73
ppo        -0.62
noticed    -0.61
kay        -0.59
orget      -0.59
ateurs     -0.58
rosse      -0.58
estern     -0.58
merce      -0.57
ipop       -0.56
Positive Logits
to            0.90
to            0.74
likely        0.71
iant          0.67
capable       0.66
ivalent       0.66
destined      0.65
Cosponsors    0.61
ELF           0.60
TO            0.60
Activations Density 0.348%