INDEX
Explanations
affirmations or agreements in discourse
Negative Logits
Morin   -0.63
MAS     -0.62
Amis    -0.61
yns     -0.59
ASC     -0.59
:///    -0.56
ATS     -0.56
atre    -0.56
ルの    -0.55
TOS     -0.55
Positive Logits
Yeah    1.70
Yeah    1.68
YEAH    1.66
yeah    1.60
YEAH    1.51
yeah    1.50
Yea     1.08
Yea     1.06
Nah     1.01
eah     1.00
Activations Density 0.031%