INDEX
Explanations
sentence endings
punctuation and statements in the text
Negative Logits
Token       Logit
manif       -0.82
oun         -0.82
cod         -0.80
unloaded    -0.72
liquid      -0.71
anus        -0.69
homebrew    -0.69
anka        -0.68
inver       -0.68
modular     -0.67
Positive Logits
Token       Logit
Please       1.04
php          1.01
push         0.94
aspx         0.93
However      0.93
(blank)      0.91
html         0.90
Though       0.89
Thanks       0.87
If           0.86
Activations Density: 0.144%