Explanations
mentions of people's names or locations
Neuron Alignment
Index    Value    % of L₁
1092     +0.14    0.5%
938      +0.14    0.5%
314      +0.12    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
981      +0.14       0.06
1092     +0.14       0.05
314      +0.12       0.03
Negative Logits
anueva         -0.51
CascadeType    -0.50
ējās           -0.48
lossene        -0.47
amphi          -0.47
ränkt          -0.46
arkhand        -0.46
morphe         -0.46
chert          -0.46
arterio        -0.45
Positive Logits
fatis       0.86
kristal     0.85
al          0.81
Spal        0.79
Mémoires    0.78
spal        0.76
solidar     0.76
eccl        0.75
schal       0.75
salu        0.73
Activations Density: 0.266%