INDEX
Explanations
numerical figures such as statistics and quantities
Neuron Alignment
Index   Value   % of L₁
1150    +0.12   0.4%
1778    +0.10   0.3%
1385    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1527    +0.12      0.04
609     +0.10      0.04
1516    +0.10      0.03
Negative Logits
település   -0.81
municipi    -0.79
Спољашње    -0.75
Genau       -0.73
Conteúdo    -0.69
Czym        -0.69
Apesar      -0.68
AndEndTag   -0.67
Mesmo       -0.66
ffilm       -0.65
Positive Logits
impra    0.56
cushi    0.56
hairc    0.55
ecru     0.54
suscep   0.53
lovel    0.53
disreg   0.52
ousand   0.52
javier   0.52
sergio   0.50
Activations Density: 0.106%