INDEX
Explanations
proper nouns related to politics and individuals in the context of investigations
mentions of specific names or figures, particularly political ones
Negative Logits
ERAL     -0.81
cation   -0.69
DCS      -0.69
hered    -0.68
hetto    -0.68
CoC      -0.67
iary     -0.67
reet     -0.64
ician    -0.62
imates   -0.62
Positive Logits
xus      0.91
Rousse   0.79
yss      0.75
bra      0.73
stal     0.72
lete     0.70
acl      0.69
thia     0.69
lda      0.68
rl       0.68
Activations Density 0.031%