INDEX
Explanations
references to the present or current state
Neuron Alignment
Index    Value    % of L₁
16       +0.12    0.6%
11       +0.11    0.6%
246      +0.11    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
16       +0.12       0.04
11       +0.11       0.04
491      +0.11       0.03
Negative Logits
certain        -1.50
á̝             -1.50
Wald           -1.49
particular     -1.43
xe             -1.38
nod            -1.36
Nad            -1.35
ingo           -1.34
prestigious    -1.32
qualified      -1.30
Positive Logits
Īĺ            1.88
ķ             1.63
pan           1.62
through       1.60
¾             1.57
generation    1.56
ulence        1.50
RUPT          1.47
through       1.46
planes        1.44
Activations Density 1.915%