INDEX
Explanations
URLs and domain references in the text
Negative Logits
jen      -0.17
afone    -0.16
uez      -0.15
kovi     -0.14
/rss     -0.14
jvu      -0.14
Thur     -0.14
Comm     -0.14
stad     -0.13
rias     -0.13
Positive Logits
/wp      0.23
/?       0.22
wp       0.21
.au      0.20
lify     0.19
wp       0.18
Blog     0.17
201      0.16
enser    0.16
blog     0.16
Activations Density 0.052%
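
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of tokens on which the feature fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The source page does not confirm this.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 tokens, of which 5 activate the feature.
toy = [0.0] * 9995 + [1.3, 0.7, 2.1, 0.4, 0.9]
print(f"{activation_density(toy):.3%}")  # prints 0.050%
```

Under this reading, a density of 0.052% would mean the feature is active on roughly 5 of every 10,000 tokens, which fits a narrow feature such as one tied to URLs and domain references.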