INDEX
Explanations
phrases related to study outcomes or results, with a focus on specific settings and circumstances
Neuron Alignment
Index   Value   % of L₁
1379    +0.12   0.4%
1034    +0.12   0.4%
468     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1708    +0.12      0.02
1379    +0.12      0.02
1173    +0.12      0.02
Negative Logits
Juf        -0.66
attirer    -0.62
marquer    -0.60
reluct     -0.58
résister   -0.58
Compañ     -0.58
Ceux       -0.58
étu        -0.57
LIRE       -0.56
Madi       -0.55
Positive Logits
outcome    1.24
outcomes   1.17
outcome    1.15
Outcome    1.12
outcomes   1.09
Outcomes   1.03
Outcome    1.03
Outcomes   0.96
OUTCOMES   0.84
kaos       0.72
Activations Density: 0.088%