INDEX
Explanations
words related to flavor and taste preferences
Neuron Alignment
Index    Value    % of L₁
214      +0.14    0.8%
412      +0.12    0.7%
127      +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
314      +0.14       0.02
386      +0.12       0.01
127      +0.12       0.03
Negative Logits
Ł       -2.15
´       -1.75
ITED    -1.74
¾       -1.68
onds    -1.66
ypse    -1.62
Ĭ       -1.60
¦       -1.59
µ       -1.58
UIC     -1.54
Positive Logits
buds         1.94
bud          1.82
root         1.75
bank         1.67
iness        1.67
wart         1.43
consin       1.40
approving    1.40
^[           1.40
ual          1.37
Activations Density: 0.225%