INDEX
Explanations
phrases related to persuading or convincing someone
Negative Logits
nam           -0.76
practice      -0.72
endor         -0.70
eworthy       -0.70
abytes        -0.70
OIL           -0.69
alm           -0.69
lain          -0.67
emale         -0.66
owment        -0.66
Positive Logits
tale          0.90
ingly         0.83
skeptics      0.82
convinc       0.79
reluctant     0.77
convince      0.75
voters        0.73
ments         0.71
policymakers  0.70
me            0.70
Activations Density 0.022%