Explanations
phrases related to social and political issues, particularly focusing on women's rights and equality
Neuron Alignment
Index   Value   % of L₁
674     +0.36   1.2%
1967    +0.17   0.6%
478     +0.15   0.5%
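The page does not define the "% of L₁" column. A minimal sketch, assuming it means each neuron's absolute value as a share of the L₁ norm of the feature's full weight vector, backs out the norm implied by each Neuron Alignment row. The function names are hypothetical, and the percentages are rounded to one decimal place, so the estimates are only approximate.

```python
# Assumption (not stated on the page): "% of L1" = 100 * |value| / L1 norm
# of the feature's weight vector over all neurons.

def pct_of_l1(value: float, l1_norm: float) -> float:
    """Share of the L1 norm contributed by one neuron, in percent."""
    return 100.0 * abs(value) / l1_norm


def implied_l1_norm(value: float, pct: float) -> float:
    """Back out the L1 norm implied by a single (value, percent) row."""
    return abs(value) / (pct / 100.0)


# Rows copied from the Neuron Alignment table: (index, value, % of L1).
rows = [(674, 0.36, 1.2), (1967, 0.17, 0.6), (478, 0.15, 0.5)]

estimates = [implied_l1_norm(value, pct) for _, value, pct in rows]

for (index, _, _), estimate in zip(rows, estimates):
    print(f"neuron {index}: implied L1 norm ~ {estimate:.1f}")
```

Under this reading, the three rows should give similar estimates. A large disagreement between them would suggest the column is defined differently.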
Correlated Neurons
Index   Pearson Corr.   Cos Sim.
478     +0.36           0.08
1533    +0.17           0.05
610     +0.15           0.09
Negative Logits
Token      Logit
McLaugh    -0.76
heapq      -0.75
unspeak    -0.72
pymysql    -0.70
seperti    -0.69
Vrij       -0.68
shenan     -0.67
mengg      -0.65
disreg     -0.64
apprehen   -0.64
Positive Logits
Token       Logit
silikon     1.17
alkoh       1.14
sopr        1.13
Settembre   1.07
Luglio      1.06
liev        1.06
karton      1.04
allarg      1.04
Ottobre     1.03
torba       1.01
Activations Density 0.466%