INDEX
Explanations
specific structural elements and formatting in text
Negative Logits
ſeveral         -0.73
itſelf          -0.70
виправивши      -0.69
chré            -0.68
quelcon         -0.68
humaines        -0.68
firſt           -0.67
Monfieur        -0.67
sexuales        -0.67
GoogleFonts     -0.66
Positive Logits
hirt            0.58
'))             0.52
virons          0.50
core            0.49
Kam             0.49
Core            0.48
asiness         0.48
))));           0.47
)))             0.47
Fe              0.47
Activations Density 1.489%