INDEX
Explanations
websites related to political news and opinion
Neuron Alignment
Index    Value    % of L₁
1343     +0.23    0.7%
1978     +0.14    0.4%
382      +0.13    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1343     +0.23       0.04
382      +0.14       0.04
1570     +0.13       0.03
Negative Logits
impra        -1.08
thut         -1.05
inext        -1.05
scrat        -1.00
oleo         -0.99
shenan       -0.98
intermitt    -0.97
jurassic     -0.96
unve         -0.96
ecru         -0.95
Positive Logits
9    0.98
7    0.96
6    0.95
8    0.94
5    0.94
4    0.93
3    0.88
0    0.86
2    0.85
1    0.84
Activations Density 0.130%