Explanations
references to organizations and entities involved in environmental or social activism
Neuron Alignment
Index   Value   % of L₁
184     +0.33   1.2%
50      +0.20   0.7%
453     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
981     +0.33      0.07
184     +0.20      0.02
1884    +0.12      0.05
Negative Logits
<bos>       -2.77
intios      -0.76
Vidite      -0.68
.           -0.66
↵↵          -0.64
Aholisi     -0.64
(           -0.63
Dizziness   -0.62
,           -0.62
///**       -0.62
Positive Logits
saar      1.88
maksi     1.85
meis      1.84
kram      1.81
lele      1.75
pank      1.74
silikon   1.74
makro     1.73
kaos      1.72
kac       1.70
Activations Density 0.385%