INDEX
Explanations
references to themes of censorship and media criticism
connections related to censorship and media discussions
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.08
4:0.08
5:0.08
6:0.08
7:0.08
8:0.08
9:0.08
10:0.07
11:0.08
Negative Logits
�醒             -3.01
Produ          -2.96
Anim           -2.80
ã              -2.76
Tarant         -2.73
Participation  -2.67
ufact          -2.67
aptic          -2.66
Produ          -2.66
Princ          -2.65
Positive Logits
Wallet         3.04
Kitt           2.77
luster         2.77
itte           2.76
broom          2.76
thin           2.62
Rosenthal      2.55
wallet         2.54
mates          2.52
logs           2.51
Activations Density 0.000%