INDEX
Explanations
special characters within code snippets
Neuron Alignment
Index   Value   % of L₁
795     +0.15   0.6%
1407    +0.15   0.6%
605     +0.14   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1407    +0.15      0.02
795     +0.15      0.02
966     +0.14      0.02
Negative Logits
Token      Logit
handsome   -0.47
Obsah      -0.45
rumbling   -0.45
Vlast      -0.44
Díky       -0.44
Polen      -0.43
Když       -0.43
Nowak      -0.43
reggae     -0.43
horned     -0.43
Positive Logits
Token       Logit
">*         0.83
Ottobre     0.78
('*         0.77
*}          0.77
*           0.77
Settembre   0.76
germain     0.75
//*         0.73
{*          0.73
vaila       0.72
Activations Density 0.093%