INDEX
Explanations
sentences related to societal and political commentary
Neuron Alignment
Index   Value   % of L₁
2019    +0.13   0.4%
1741    +0.11   0.3%
382     +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1265    +0.13      0.05
382     +0.11      0.06
1173    +0.11      0.04
Negative Logits
Token    Logit
istan    -1.22
makro    -1.18
lampa    -1.17
teras    -1.16
riva     -1.15
parati   -1.15
sement   -1.14
maroc    -1.13
kela     -1.12
balon    -1.11
Positive Logits
Token           Logit
however         0.98
though          0.88
although        0.79
meanwhile       0.76
which           0.75
albeit          0.74
however         0.74
unfortunately   0.73
especially      0.72
despite         0.68
Activations Density: 0.282%