Explanations
phrases containing the word "Yes" and then, often, a statement or affirmation
repetitive affirmations or confirmations in dialogue
Negative Logits
Token       Logit
èĢħ         -0.83
arted       -0.82
Reviewed    -0.76
ļéĨĴ        -0.73
igion       -0.72
-+          -0.70
ende        -0.69
)=(         -0.68
è£ħ         -0.68
rance       -0.68
Positive Logits
Token          Logit
sir            1.04
yes            0.99
yeah           0.91
technically    0.87
please         0.81
indeed         0.81
there          0.81
THERE          0.79
steroids       0.76
it             0.76
Activations Density 0.076%