INDEX
Explanations
long descriptions of physical appearance
Neuron Alignment
Index   Value   % of L₁
1385    +0.11   0.3%
736     +0.10   0.3%
964     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
736     +0.11      0.04
1906    +0.10      0.02
1818    +0.09      0.04
Negative Logits
Token      Logit
Cfr        -1.16
allarg     -1.12
dises      -1.12
squa       -1.10
embra      -1.08
socie      -1.08
wien       -1.04
haup       -1.04
Simult     -1.03
bordeaux   -1.03
Positive Logits
Token       Logit
hair        0.96
hair        0.74
<bos>       0.69
hairstyle   0.69
Hair        0.68
Hair        0.68
haired      0.64
tóc         0.64
HAIR        0.59
hairs       0.58
Activations Density 0.155%
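
The P. Corr. and Cos Sim. columns and the activations density figure can be read as three different statistics. The sketch below shows one plausible way to compute each, assuming P. Corr. is a Pearson correlation between two activation series, Cos Sim. is a cosine similarity between two weight vectors, and density is the fraction of tokens with a nonzero activation. The function names and sample inputs are hypothetical and are not taken from the dashboard's own code.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length activation series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def cosine(u, v):
    """Cosine similarity between two weight vectors (no mean-centering)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def activation_density(activations):
    """Fraction of tokens on which the feature fires (activation > 0)."""
    fired = sum(1 for a in activations if a > 0)
    return fired / len(activations)


if __name__ == "__main__":
    # Hypothetical toy data, only to exercise the helpers.
    feature_acts = [0.0, 0.0, 0.5, 0.0]
    neuron_acts = [0.1, 0.0, 0.9, 0.2]
    print(pearson(feature_acts, neuron_acts))
    print(cosine([1.0, 0.0], [0.0, 1.0]))
    print(activation_density(feature_acts))
```

Under these assumptions, a low cosine similarity next to a modest Pearson correlation, as in the table, would mean the two units tend to fire together on some tokens while pointing in nearly unrelated directions in weight space.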