INDEX
Explanations
phrases related to authorship or attribution in text
Negative Logits
used      -0.16
cá        -0.15
adj       -0.14
Patron    -0.14
htags     -0.13
excess    -0.13
patron    -0.13
má        -0.13
anders    -0.13
Esper     -0.13
Positive Logits
admin            0.25
admin            0.25
Admin            0.21
Admin            0.20
admins           0.19
_admin           0.17
admin            0.17
rodi             0.17
Administrator    0.17
interop          0.16
Activations Density 0.012%