Explanations
expressions of apology and personal communication in a conversational context
Negative Logits
Token              Logit
<eos>              -0.57
pr                 -0.52
اح                 -0.48
…                  -0.46
↵↵                 -0.43
par                -0.42
?                  -0.42
pri                -0.42
iral               -0.41
verifyException    -0.41
Positive Logits
Token              Logit
Efq                1.35
myſelf             1.30
Monfieur           1.13
pleaſure           1.12
Anſ                1.11
becauſe            1.09
ſeveral            1.08
Houſe              1.06
purpoſe            1.05
themſelves         1.05
Activations Density 0.451%
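
For reference, an activation density figure like the one above is usually read as the share of sampled tokens on which the feature fires at all. The sketch below is only an illustration under that assumption: `activation_density`, its `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard or from any specific library.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its feature activation is
    strictly greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2000 tokens, 9 of which activate the feature.
sample = [0.0] * 2000
for i in range(9):
    sample[i * 200] = 1.5

# Print the density as a percentage with three decimal places.
print(f"Activation density: {activation_density(sample):.3%}")
```

A density well below 1% means the feature is sparse: it is silent on the vast majority of tokens in the sample.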