INDEX
Explanations
authorship indicators or references to authors in the text
Negative Logits
apore         -0.15
mlin          -0.15
ander         -0.14
refixer       -0.14
кас           -0.14
endar         -0.14
abyrin        -0.14
.microsoft    -0.14
andr          -0.14
spoken        -0.13
Positive Logits
onium         0.15
edd           0.15
burgh         0.15
--,           0.14
_FMT          0.14
edio          0.14
шиб           0.14
PARSE         0.14
.Fat          0.14
SizeMode      0.14
Activations Density 0.004%