INDEX
Explanations
phrases related to controversial medical practices and ethical debates
Neuron Alignment
Index   Value   % of L₁
110     +0.10   0.3%
207     +0.09   0.3%
1843    +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
207     +0.10      0.04
110     +0.09      0.03
509     +0.08      0.04
Negative Logits
depic      -1.74
affor      -1.68
unden      -1.68
impra      -1.68
guarante   -1.65
Mlle       -1.64
increa     -1.61
indestru   -1.60
emphat     -1.60
accla      -1.59
Positive Logits
Hospice    0.86
death      0.80
dying      0.76
hospice    0.73
death      0.69
terminal   0.69
EoL        0.67
die        0.65
Funeral    0.65
deaths     0.65
Activations Density: 0.305%