INDEX
Explanations
phrases indicating approval or support
expressions related to voting in favor of proposals or legislation
Negative Logits
Token    Logit
Brist    -0.71
tis      -0.69
Gorge    -0.64
LES      -0.64
ı        -0.63
legged   -0.63
ridges   -0.63
liam     -0.61
Pist     -0.61
Torn     -0.60
Positive Logits
Token    Logit
itism    1.25
ability  0.79
ative    0.76
favoring 0.75
ably     0.75
ality    0.74
uate     0.72
parency  0.71
ibility  0.71
atives   0.71
Activations Density 0.016%
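
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the toy sample are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: density = (tokens where the feature fires) / (all tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): the feature fires on 1 of 6250 tokens.
sample = [0.0] * 6249 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.016%
```

Under that assumption, a density of 0.016% corresponds to the feature firing on roughly one token in every 6,250.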