INDEX
Explanations
phrases indicating approval or agreement
instances of the phrase "Okay" or similar expressions of acknowledgment
Negative Logits
Lago      -0.76
*/(       -0.68
tnc       -0.68
aternity  -0.67
auri      -0.64
atum      -0.62
PT        -0.61
edom      -0.61
prost     -0.60
apers     -0.60
Positive Logits
bye       0.93
yeah      0.82
yeah      0.80
zers      0.80
hhhh      0.77
lahoma    0.77
huh       0.77
hhh       0.77
kidding   0.77
prest     0.76
Activations Density 0.058%