INDEX
Explanations
terms related to health, biology, and medical conditions
Neuron Alignment
Index   Value   % of L₁
27      +0.16   0.9%
395     +0.14   0.8%
23      +0.13   0.8%
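The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it is each entry's absolute value divided by the L₁ norm (sum of absolute values) of the whole weight vector: the function name `l1_shares` and the toy vector are hypothetical and are not taken from this dashboard.

```python
def l1_shares(weights, k=3):
    """Return the k largest-magnitude entries as (index, value, share of L1 norm).

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. This is a guess at the
    column's meaning, not a definition confirmed by the source.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank entries by magnitude, largest first, keeping their original indices.
    ranked = sorted(enumerate(weights), key=lambda iw: abs(iw[1]), reverse=True)
    return [(i, w, abs(w) / l1) for i, w in ranked[:k]]


# Toy vector for illustration only; not data from this page.
toy = [0.5, -1.5, 2.0, 0.0]
for idx, val, share in l1_shares(toy, k=2):
    print(f"{idx}\t{val:+.2f}\t{share:.1%}")
```

Under this reading, small percentages for the top entries would indicate that the vector's weight is spread across many neurons rather than concentrated in a few.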
Correlated Neurons
Index   P. Corr.   Cos Sim.
27      +0.16      0.03
489     +0.14      0.04
401     +0.13      0.09
Negative Logits
Ļª      -2.96
¤       -2.80
³       -2.77
½       -2.76
´       -2.72
Ŀ       -2.71
ŀ       -2.70
»¿      -2.70
¯       -2.66
Ļ       -2.62
Positive Logits
rapy         1.59
scal         1.55
inned        1.52
below        1.52
liner        1.50
user         1.48
depression   1.48
duplicates   1.48
first        1.45
well         1.44
Activations Density 1.046%