Explanations
instances of authorship or attribution in the text
Negative Logits
RegressionTest     -0.62
seamnă             -0.55
ویکیپدی            -0.50
podjela            -0.47
tagHelperRunner    -0.47
estekak            -0.45
operativos         -0.45
aikaa              -0.44
Allociné           -0.44
ReusableCell       -0.44
Positive Logits
by                 0.82
By                 0.81
By                 0.80
BY                 0.72
by                 0.67
BY                 0.63
byn                0.53
CreatedBy          0.52
createdBy          0.51
von                0.51
Activations Density: 0.199%