INDEX
Explanations
affirmations and agreements in conversations
Negative Logits
Stoll        -0.71
Zav          -0.68
Mab          -0.68
Datuak       -0.67
,            -0.67
Chor         -0.67
chocolates   -0.66
ASF          -0.66
Esther       -0.65
TOC          -0.64
Positive Logits
Yeah         1.09
Yeah         1.08
guys         1.04
GUYS         1.03
guys         1.01
yeah         0.98
Guys         0.98
guy          0.96
YEAH         0.95
Guys         0.93
Activations Density 0.063%
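
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of sampled tokens on which the feature's activation is above a threshold. This is an assumption about the metric, not a statement of how this page derived it. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (# active tokens) / (# tokens sampled).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic toy data (not taken from the page): 100,000 tokens, 63 of them active.
toy_activations = [1.0] * 63 + [0.0] * (100_000 - 63)
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints "0.063%"
```

Under this reading, a density of 0.063% means the feature is active on roughly 6 of every 10,000 tokens, which fits a feature tied to a narrow set of conversational tokens such as "Yeah" and "guys".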