INDEX
Explanations
female names and emotional expressions
Neuron Alignment
Index    Value    % of L₁
1978     +0.09    0.2%
1795     +0.08    0.2%
272      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1795     +0.09       0.05
862      +0.08       0.03
588      +0.08       0.06
Negative Logits
inev        -1.82
effe        -1.79
fuf         -1.78
aen         -1.78
fta         -1.78
increa      -1.77
guarante    -1.73
secon       -1.73
wien        -1.72
fte         -1.71
Positive Logits
ready         0.77
available     0.77
ήσει          0.69
styleType     0.66
done          0.66
installed     0.65
ългария       0.65
EndProject    0.65
lined         0.64
underway      0.64
Activations Density 0.448%