Explanations
words related to color and language, particularly abstract color words and descriptive color words
Neuron Alignment
Index    Value    % of L₁
50       +0.14    0.4%
605      +0.11    0.3%
382      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
16       +0.14       0.06
50       +0.11       0.05
3        +0.10       0.05
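The two similarity columns in the Correlated Neurons table can be illustrated with a minimal sketch. It assumes that "P. Corr." is a Pearson correlation between two activation series and that "Cos Sim." is a cosine similarity between two weight vectors; the source does not state how either figure was produced, and the function names `pearson` and `cosine` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences (assumed meaning of 'P. Corr.')."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed meaning of 'Cos Sim.')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Both functions raise `ZeroDivisionError` on constant or all-zero inputs; a caller working with real activations would need to decide how to treat that case.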
Negative Logits
Token              Logit
IUrlHelper         -0.82
виправивши         -0.76
<bos>              -0.69
rawDesc            -0.62
RTEE               -0.60
onViewCreated      -0.57
таратура           -0.54
invokeLater        -0.53
SequentialGroup    -0.53
bài                -0.53
Positive Logits
Token        Logit
stockholm    0.87
roth         0.84
mef          0.83
lyon         0.82
aen          0.82
fep          0.81
psg          0.80
lidl         0.80
doman        0.80
fuj          0.80
Activations Density: 0.415%
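As a rough illustration of the density figure, the sketch below assumes that activation density means the fraction of sampled tokens on which the feature's activation exceeds a threshold (zero by default). That definition, and the function name `activation_density`, are assumptions and are not confirmed by the source.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical example: 1 active token out of 4 gives a density of 25%.
example = [0.0, 0.0, 0.5, 0.0]
print(f"{activation_density(example):.3%}")
```

Under this reading, a density of 0.415% would correspond to the feature firing on roughly 4 of every 1,000 sampled tokens.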