INDEX
Explanations
phrases related to specific conditions or requirements
Neuron Alignment
Index   Value   % of L₁
31      +0.13   0.5%
411     +0.13   0.5%
1133    +0.12   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
411     +0.13      0.03
1133    +0.13      0.03
1548    +0.12      0.02
Negative Logits
Più       -0.53
heça      -0.53
DAR       -0.52
Abbiamo   -0.50
avesse    -0.48
rius      -0.48
Siamo     -0.47
Dar       -0.47
Dar       -0.47
siąż      -0.46
Positive Logits
conditions   1.30
Conditions   1.28
condition    1.28
conditions   1.25
Condition    1.24
condition    1.23
Conditions   1.22
Condition    1.15
CONDITION    1.14
CONDITIONS   1.13
Activations Density: 0.077%