Explanations
terms related to stances or positions on various issues
references to someone's stance or position on various issues
Negative Logits
Token            Logit
batch            -0.75
chemy            -0.66
ãĥ¼ãĥĨãĤ£        -0.66
ombies           -0.63
unsuspecting     -0.62
loo              -0.62
gallery          -0.62
destro           -0.60
stuffing         -0.60
trap             -0.59
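
The token "ãĥ¼ãĥĨãĤ£" in the negative-logit list is probably not corrupted text. It looks like the printable form that byte-level BPE vocabularies use for raw UTF-8 bytes. The sketch below assumes the GPT-2 style byte-to-unicode table (the source does not name the tokenizer) and uses a hypothetical helper, decode_token, to map such a string back to readable text.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed here)."""
    # Printable bytes map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at or above 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: printable character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a vocabulary string back into UTF-8 text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a character, so replace on error.
    return raw.decode("utf-8", errors="replace")


decoded = decode_token("ãĥ¼ãĥĨãĤ£")
print(decoded)  # -> ーティ
```

Under that assumption the entry is the katakana sequence "ーティ", which is unrelated to stances or positions, as are the other negative-logit tokens.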
Positive Logits
Token            Logit
stances          1.20
stance           1.17
toward           1.08
regarding        1.06
towards          1.03
favoring         0.97
on               0.92
concerning       0.89
vis              0.89
upholding        0.88
Activations Density 0.180%