INDEX
Explanations
alcoholic beverages and music-related terms
Neuron Alignment
Index   Value   % of L₁
184     +0.29   1.1%
1343    +0.22   0.8%
964     +0.19   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
184     +0.29      0.03
1343    +0.22      0.06
612     +0.19      0.01
Negative Logits
impractica   -1.18
unlaw        -1.14
reluct       -1.12
philanth     -1.10
indestru     -1.08
pamph        -1.07
unve         -1.07
seclu        -1.06
impra        -1.06
disagre      -1.06
Positive Logits
kloped             0.70
quias              0.68
RectangleBorder    0.66
LayoutConstraint   0.63
InputBorder        0.63
Về                 0.58
eorum              0.56
Nhưng              0.55
extAlignment       0.54
xase               0.54
Activations Density: 0.295%