INDEX
Explanations
instances of content categorization and tagging in a document
Negative Logits
694         -0.16
ija         -0.15
antics      -0.14
edin        -0.14
pagesize    -0.14
hole        -0.14
det         -0.14
orque       -0.14
anche       -0.14
Gregory     -0.13
Positive Logits
adera       0.17
enario      0.16
">//        0.15
viol        0.15
Beg         0.15
Weston      0.14
sth         0.14
Howell      0.14
.MODE       0.14
sic         0.13
Activations Density: 0.002%