Explanations
HTML and structural elements in web content
Negative Logits
dum       -0.16
illus     -0.15
ategory   -0.15
Pepper    -0.15
eb        -0.15
engin     -0.14
ycl       -0.14
las       -0.14
edback    -0.14
rego      -0.14
Positive Logits
lang      0.30
lang      0.27
xmlns     0.23
.lang     0.23
(lang     0.22
LANG      0.21
Lang      0.20
-lang     0.20
/lang     0.20
Lang      0.20
Activations Density 0.005%