INDEX
Explanations
references to media and journalism, particularly in the context of censorship and press credentials
Negative Logits
Token          Logit
igel           -0.16
μι             -0.15
cury           -0.14
opoulos        -0.14
—              -0.14
Mrs            -0.13
ampus          -0.13
autos          -0.13
accidentally   -0.13
ziej           -0.13
Positive Logits
Token          Logit
CP             0.32
CP             0.28
-CP            0.24
cp             0.23
.cp            0.23
_CP            0.22
(cp            0.22
_cp            0.22
,cp            0.22
cp             0.21
Activations Density: 0.002%