INDEX
Explanations
direct or indirect requests or calls to action
imperative phrases that suggest requesting information or inquiries
Negative Logits
Ĥ¬       -0.75
cutting  -0.73
rongh    -0.68
rient    -0.67
ffen     -0.66
pite     -0.66
lim      -0.65
swing    -0.65
zinski   -0.65
abama    -0.64
Positive Logits
naires     1.09
questions  1.02
rhet       1.01
probing    0.95
wered      0.87
asked      0.86
answered   0.83
answ       0.82
erville    0.81
politely   0.80
Activations Density 0.047%
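
As a rough illustration of what a density figure such as 0.047% could mean, the sketch below assumes that density is the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires at all. The page itself does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 47 of 100,000 tokens.
sample = [1.5] * 47 + [0.0] * 99_953
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.047%
```

Under this reading, a density of 0.047% corresponds to the feature firing on roughly 47 out of every 100,000 tokens, which would make it a sparse feature.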