INDEX
Explanations
types and categories of objects or substances
Neuron Alignment
Index    Value    % of L₁
17       +0.24    1.6%
47       +0.21    1.4%
198      +0.19    1.2%
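The page does not define the "% of L₁" column. One plausible reading, taken here as an assumption, is that it is each neuron's absolute weight divided by the L₁ norm of the whole feature vector. Under that reading the three rows above would imply an L₁ norm of roughly 15 to 16. The sketch below illustrates that calculation on a hypothetical toy vector; the function name `l1_share` and the data are illustrative only and are not taken from the dashboard.

```python
def l1_share(vector, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means |value| / sum(|all values|).
    """
    l1_norm = sum(abs(v) for v in vector)
    if l1_norm == 0:
        return []
    # Rank indices by absolute weight, largest first.
    ranked = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    return [(i, vector[i], abs(vector[i]) / l1_norm) for i in ranked[:top_k]]


# Hypothetical toy vector, not the real feature weights.
toy_vector = [0.5, -0.25, 0.125, 0.125]
rows = l1_share(toy_vector, top_k=2)
for index, value, share in rows:
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```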
Correlated Neurons
Index    P. Corr.    Cos Sim.
17       +0.24       0.02
47       +0.21       0.14
198      +0.19       0.09
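"P. Corr." presumably abbreviates Pearson correlation and "Cos Sim." cosine similarity, though the page does not say which vectors each metric compares. As a minimal sketch of how the two measures differ, the code below computes both on hypothetical toy data: Pearson correlation is cosine similarity applied after subtracting each vector's mean, so the two can disagree sharply on the same inputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


# Hypothetical toy data, not values from the dashboard.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
cos_ab = cosine_similarity(a, b)        # positive: all entries share a sign
pearson_ab = pearson_correlation(a, b)  # negative: b falls as a rises
```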
Negative Logits
ª     -2.86
¦     -2.85
©     -2.74
¸     -2.73
§     -2.68
Ĩ     -2.64
¨     -2.62
Ļ     -2.62
¿½    -2.61
¯     -2.58
Positive Logits
ents         1.42
ouble        1.39
arity        1.38
uple         1.38
footprint    1.38
Rapids       1.37
hedron       1.36
orphism      1.34
ellation     1.34
context      1.34
Activations Density 2.183%
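The density figure is commonly read as the fraction of sampled tokens on which the feature fires at all; the page does not state the exact definition, so the threshold and the function name below are assumptions. A minimal sketch on hypothetical toy activations:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than zero by default.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy activations: the feature fires on 1 of 4 tokens.
toy_activations = [0.0, 0.0, 1.3, 0.0]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 2.183% would mean the feature is active on about 1 in 46 sampled tokens.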