INDEX
Explanations
names of newspapers and media outlets
Negative Logits
098      -0.17
vek      -0.15
ignon    -0.15
keh      -0.15
097      -0.15
445      -0.15
lys      -0.14
umar     -0.14
107      -0.14
rec      -0.14
Positive Logits
/Dk               0.19
ousel             0.18
ÃĹ↵↵              0.17
ãĥĭãĥ¼            0.17
argout            0.16
outlet            0.16
exclusively       0.16
.scalablytyped    0.15
interviewer       0.15
odo               0.14
Activations Density 0.027%