Explanations
dialogue interjections and affirmations
Negative Logits
,             0.92
،             0.85
、            0.83
.--           0.81
Также         0.79
Ainsi         0.77
,…            0.76
Также         0.76
,--           0.75
Additionally  0.75
Positive Logits
yeah          1.33
huh           1.18
eh            1.05
but           1.03
yep           1.01
alright       1.01
okay          1.00
yes           1.00
haha          0.98
aye           0.95
Activations Density: 0.608%