INDEX
Explanations
phrases related to healthcare policies and practices
Neuron Alignment
Index   Value   % of L₁
186     +0.24   1.4%
156     +0.20   1.2%
47      +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
362     +0.24      0.13
490     +0.20      0.10
186     +0.14      0.08
Negative Logits
ème             -1.78
documentclass   -1.60
umab            -1.59
pudding         -1.51
etus            -1.50
cake            -1.48
Boltzmann       -1.41
illustr         -1.40
necklace        -1.38
button          -1.38
Positive Logits
Ļª              2.06
ĥ               1.88
Ł               1.63
Īĺ              1.61
¼               1.60
mun             1.53
arma            1.49
particular      1.48
ull             1.47
Ĥ¬              1.45
Activations Density: 1.784%