INDEX
Explanations
descriptions of everyday activities and situations
Neuron Alignment
Index    Value    % of L₁
190      +0.09    0.3%
906      +0.09    0.3%
1150     +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
736      +0.09       0.07
1438     +0.09       0.04
1328     +0.08       0.05
Negative Logits
maksi      -1.49
hina       -1.45
lele       -1.44
alkoh      -1.42
uhr        -1.37
keramik    -1.36
Meksi      -1.33
optik      -1.33
mef        -1.32
haup       -1.31
Positive Logits
belonged    0.72
wasn        0.72
was         0.69
türlü       0.67
Może        0.66
dropped     0.66
znaj        0.64
belongs     0.64
turned      0.62
stored      0.62
Activations Density 0.587%