INDEX
Explanations
dates or specific numbers mentioned in a sentence
the presence of specific suffixes or word endings
Negative Logits
footed          -0.78
ierrez          -0.69
CPR             -0.65
minecraft       -0.64
repre           -0.62
DOI             -0.62
avorite         -0.61
Folder          -0.59
udeb            -0.59
lawy            -0.59

Positive Logits
roth            0.99
RAL             0.79
ral             0.78
士              0.69
opol            0.67
enne            0.67
»Ĵ              0.66
Ĥİ              0.64
bor             0.63
DragonMagazine  0.63
Activations Density 0.114%
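
The density figure above is not defined on this page. A common reading is the fraction of sampled token positions on which the feature fires at all. The sketch below illustrates that reading only: the function name `activation_density`, the zero threshold, and the sample values are all assumptions for illustration, not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a non-zero
    (strictly above threshold) activation; the source page does not say.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 2 of 8 token positions.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
print(f"{activation_density(sample):.3%}")  # prints 25.000%
```

Under this reading, a density of 0.114% would correspond to the feature firing on roughly 1 in every 877 sampled tokens.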