INDEX
Explanations
phrases related to layouts and designs
Neuron Alignment
Index    Value    % of L₁
24       +0.14    0.4%
1870     +0.13    0.4%
332      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
24       +0.14       0.07
1705     +0.13       0.06
1870     +0.10       0.04
Negative Logits
Khart     -0.84
Darío     -0.83
Juf       -0.82
Mlle      -0.81
migli     -0.78
Immig     -0.78
Malte     -0.78
embra     -0.78
Intere    -0.77
Hez       -0.77
Positive Logits
changes          0.60
layout           0.59
patterns         0.58
布局             0.56
pattern          0.56
changed          0.55
habits           0.55
arrangements     0.54
utafitiHapana    0.54
assumptions      0.54
Activations Density: 0.449%