INDEX
Explanations
mentions of actions or opinions that are considered notable or important
Neuron Alignment
Index    Value    % of L₁
1691     +0.10    0.3%
1491     +0.09    0.3%
1392     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
642      +0.10       0.03
1691     +0.09       0.03
899      +0.09       0.03
Negative Logits
ToDecimal      -0.55
Ծանոթ          -0.50
Παραπομπές     -0.48
laajenta       -0.46
Str            -0.45
xbe            -0.43
drawSprites    -0.43
xee            -0.43
Jîn            -0.43
Altri          -0.42
Positive Logits
sappi         0.88
tupperware    0.87
germain       0.82
hilux         0.80
soigne        0.79
nutella       0.77
matel         0.77
bordeaux      0.76
cabrio        0.76
riviera       0.75
Activations Density 0.153%