INDEX
Explanations
terms related to heart disease or cardiac conditions
Neuron Alignment
Index  Value  % of L₁
156    +0.28  1.7%
327    +0.11  0.7%
443    +0.11  0.7%
Correlated Neurons
Index  P. Corr.  Cos Sim.
156    +0.28     0.00
327    +0.11     0.00
184    +0.11     0.00
Negative Logits
cual     -1.65
uffling  -1.50
hes      -1.45
racist   -1.43
woke     -1.38
edes     -1.35
že       -1.32
click    -1.32
fois     -1.32
ção      -1.31
Positive Logits
emergencies     1.64
stems           1.59
regimes         1.57
(,              1.46
HV              1.45
break           1.43
infrastructure  1.37
irrigation      1.37
relies          1.36
originates      1.34
Activations Density: 0.006%