Explanations
formatting errors in input fields, specifically related to email addresses
punctuation marks, particularly periods indicating the end of sentences
Negative Logits
oun        -0.78
brut       -0.76
additive   -0.73
homebrew   -0.72
bos        -0.71
gobl       -0.69
manif      -0.69
playable   -0.69
ouf        -0.69
anus       -0.68
Positive Logits
Please     1.07
php        0.95
dll        0.93
html       0.93
tumblr     0.90
           0.89
push       0.87
jpg        0.87
aspx       0.86
However    0.86
Activations Density: 0.195%