INDEX
Explanations
patterns related to medical conditions, policies, and tech issues
Neuron Alignment
Index    Value    % of L₁
453      +0.12    0.4%
478      +0.10    0.3%
2019     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
862      +0.12       0.02
453      +0.10       0.04
1648     +0.09       0.01
Negative Logits
véhic         -1.22
élar          -1.04
malheureux    -1.01
vieill        -0.99
épu           -0.98
écou          -0.97
déplo         -0.96
écout         -0.96
soulign       -0.95
éprou         -0.92
Positive Logits
reales         0.81
diagnosed      0.73
taken          0.72
awarded        0.70
detected       0.70
delivered      0.69
picked         0.69
promoted       0.68
evaluated      0.67
implemented    0.67
Activations Density: 0.207%