INDEX
Explanations
discourse markers and conversational responses
Negative Logits
Token            Logit
فريبيس           -0.94
'\\;'            -0.78
BRARY            -0.78
aarrggbb         -0.77
AssemblyCulture  -0.77
leſs             -0.76
BoxDecoration    -0.75
Paglinawan       -0.74
ſelves           -0.74
IBLIO            -0.72
Positive Logits
Token  Logit
And    0.68
I      0.61
Yeah   0.60
Oh     0.60
Yeah   0.58
And    0.58
That   0.57
yeah   0.56
Yes    0.56
Yes    0.56
Activations Density 0.154%
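
For readers unfamiliar with the density figure, here is a minimal sketch of how such a value could be computed. It assumes density means the fraction of token positions on which the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample counts are all hypothetical, and the counts were chosen only so that the arithmetic lands on the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    The dashboard's exact definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up data: 100,000 token positions, 154 of them with a nonzero activation.
sample = [0.0] * 99_846 + [1.2] * 154
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature fires on roughly one or two tokens per thousand, which fits a feature described as tracking a narrow class of conversational openers.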