Explanations
special characters and formatting elements in text
Negative Logits
'){             -0.56
"){             -0.54
D               -0.54
De              -0.52
pat             -0.50
verwijspagina   -0.50
venuto          -0.48
lasciato        -0.48
an              -0.48
D               -0.48
Positive Logits
itſelf          1.10
themſelves      0.97
pleaſure        0.95
Monfieur        0.93
Jefus           0.93
himſelf         0.93
Anſ             0.92
Majefty         0.92
poffe           0.91
iſt             0.91
Activations Density 1.206%