INDEX
Explanations
items related to visual formatting specifications
Neuron Alignment
Index   Value   % of L₁
225     +0.15   0.8%
354     +0.12   0.7%
369     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
92      +0.15      0.03
410     +0.12      0.04
82      +0.12      0.03
Negative Logits
oarthritis   -1.68
izable       -1.51
prise        -1.49
puted        -1.48
ês           -1.48
realise      -1.43
realize      -1.40
prises       -1.39
·¸           -1.35
escence      -1.34
Positive Logits
appa     2.10
linger   1.88
ings     1.74
inger    1.67
gart     1.66
ringer   1.59
lers     1.59
ling     1.55
lem      1.53
tk       1.53
Activations Density: 0.322%