Explanations
the presence of the word "by" indicating authorship or agency in a text
Negative Logits
poon        -0.15
alom        -0.15
Crossing    -0.14
rug         -0.14
undi        -0.14
inally      -0.14
poons       -0.13
hos         -0.13
ãĥ¼ãĥķ      -0.13
_POINTER    -0.13
Positive Logits
itself      0.17
isi         0.17
odb         0.15
/of         0.15
amba        0.14
Į¨          0.14
chance      0.14
vre         0.14
definition  0.14
ali         0.14
Activations Density: 0.044%