INDEX
Explanations
information related to medical prescriptions or health concerns
Neuron Alignment
Index   Value   % of L₁
1705    +0.08   0.3%
2016    +0.07   0.3%
50      +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
736     +0.08      0.08
1080    +0.07      0.06
1703    +0.07      0.05
Negative Logits
-1.02
-1.01
-0.99
-0.98
-0.98
-0.97
-0.96
-0.96
-0.96
-0.95
Positive Logits
maneu     3.15
affor     2.94
shenan    2.88
increa    2.87
reluct    2.87
impra     2.83
depic     2.81
scrat     2.78
disagre   2.75
encomp    2.71
Activations Density 2.338%