Explanations
countries, political organizations, and geographic locations
Neuron Alignment
Index   Value   % of L₁
50      +0.41   1.5%
1577    +0.14   0.5%
1959    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
50      +0.41      0.17
227     +0.14      0.19
1177    +0.08      0.09
Negative Logits
Token         Logit
ⓧ             -0.83
trattano      -0.76
település     -0.75
💼            -0.73
revisor       -0.72
(not shown)   -0.72
citazioni     -0.70
getNombre     -0.68
emot          -0.66
abstracta     -0.66
Positive Logits
Token        Logit
Bartholo     1.06
McLaugh      1.05
sophistic    1.02
impra        1.00
Thier        0.97
unspeak      0.96
Theile       0.95
Gorb         0.95
Rine         0.94
maneu        0.94
Activations Density: 3.218%