INDEX
Explanations
phrases related to mental health and mental illnesses
Neuron Alignment
Index    Value    % of L₁
1870     +0.12    0.4%
1950     +0.10    0.4%
1310     +0.10    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1708     +0.12       0.03
1950     +0.10       0.03
251      +0.10       0.02
Negative Logits
disreg     -0.58
kras       -0.57
hej        -0.57
stak       -0.56
doub       -0.52
kade       -0.51
bej        -0.50
Cormack    -0.50
Stadion    -0.50
sneeze     -0.49
Positive Logits
mental         1.25
Mental         1.21
Mental         1.17
mental         1.02
MENTAL         0.96
mentally       0.83
Ment           0.66
psychiatric    0.59
impegno        0.59
Ment           0.58
Activations Density: 0.052%