Explanations
words and punctuation marks related to quotes and conversations
Negative Logits
purpoſe         -0.98
itſelf          -0.96
avoient         -0.95
automatiques    -0.94
zelve           -0.93
Theſe           -0.93
Diſ             -0.92
feroit          -0.91
Shakspeare      -0.90
aveug           -0.90
Positive Logits
↵↵                    0.75
.                     0.72
(token not visible)   0.63
(token not visible)   0.62
“                     0.61
“                     0.61
↵                     0.61
'                     0.58
:                     0.58
).                    0.58
Activations Density 0.691%