INDEX
Explanations
personal experiences and stories shared in interviews
Negative Logits
'."         -0.79
Sinclair    -0.68
Pai         -0.65
.'"         -0.65
pered       -0.62
]."         -0.61
Pelosi      -0.60
arted       -0.59
insert      -0.59
Dems        -0.59
Positive Logits
definitely  0.97
initely     0.96
answer      0.89
laughs      0.88
certainly   0.85
depends     0.81
Laughs      0.80
anecd       0.80
yeah        0.79
Absolutely  0.79
Activations Density: 0.306%