INDEX
Explanations
political affiliations and beliefs
personal political beliefs and identity assertions
Negative Logits
salv           -0.73
enthusi        -0.72
brisk          -0.67
culminating    -0.67
mammoth        -0.67
scrimmage      -0.64
conclud        -0.64
contag         -0.63
accelerated    -0.63
¥ŀ             -0.62
Positive Logits
I              1.50
I              1.39
My             1.21
My             1.14
myself         1.03
Honestly       1.01
my             1.01
my             0.95
But            0.92
Period         0.92
Activations Density 0.416%
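
The density figure above can be read as the share of token positions on which the feature is active. The sketch below illustrates that reading only. It assumes density is the fraction of positions with an activation above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 12,500 token positions, 52 of them active.
sample = [0.0] * 12500
for i in range(52):
    sample[i * 240] = 1.0  # spread the active positions out; indices stay in range

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.416% corresponds to the feature firing on roughly 4 of every 1,000 tokens.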