Explanations
references to domain names and associated services
Negative Logits
Token      Logit
bjerg      -0.18
rana       -0.17
eron       -0.16
impse      -0.16
ekten      -0.15
ifar       -0.15
ourcem     -0.15
teborg     -0.14
esian      -0.14
Idol       -0.14
Positive Logits
Token      Logit
domain     0.47
domains    0.45
Domain     0.39
domain     0.37
_domain    0.35
-domain    0.35
Domain     0.34
domains    0.34
Domains    0.32
DOMAIN     0.32
Activations Density: 0.081%