INDEX
Explanations
references to speakers in political or formal settings
Neuron Alignment
Index   Value   % of L₁
50      +0.14   0.8%
411     +0.12   0.7%
2011    +0.10   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1805    +0.14      0.02
783     +0.12      0.02
1964    +0.10      0.02
Negative Logits
Token      Value
<bos>      -2.88
ⓧ          -1.06
           -0.95
<?         -0.89
/*         -0.81
/**        -0.79
/***       -0.68
Vegeu      -0.66
Enllaços   -0.59
#![        -0.59
Positive Logits
Token      Value
bandung    1.27
dises      1.27
speaker    1.24
Speakers   1.22
Czechos    1.18
Minang     1.17
SPEAKER    1.17
Meksi      1.16
alpes      1.15
Speaker    1.15
Activations Density 0.103%