INDEX
Explanations
phrases related to medical conditions and treatments, particularly cancer
Neuron Alignment
Index   Value   % of L₁
67      +0.13   0.4%
1392    +0.12   0.4%
1296    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
67      +0.13      0.02
1392    +0.12      0.02
890     +0.11      0.02
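The two columns above can differ widely for the same neuron (+0.13 against 0.02 for index 67). A minimal sketch of how the two quantities are usually computed follows. It assumes "P. Corr." is the Pearson correlation between two activation series and "Cos Sim." is the cosine similarity between two weight directions; the page itself does not state this, and the function names `pearson` and `cosine` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length series.

    Assumption: this is what "P. Corr." measures, taken over activations.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances are computed on mean-centered values.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cosine(u, v):
    """Cosine similarity between two vectors (no mean-centering).

    Assumption: this is what "Cos Sim." measures, taken over weight directions.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Illustrative inputs only, not data from this page.
perfectly_correlated = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])
```

Under that reading, the two numbers describe different objects (behavior over inputs versus geometry of weights), so a neuron can correlate weakly with the feature while pointing in an almost orthogonal direction.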
Negative Logits
disreg     -0.66
liberi     -0.65
PLW        -0.63
burberry   -0.61
dilap      -0.60
Squal      -0.56
quoique    -0.54
encomp     -0.54
indsay     -0.53
claudia    -0.53
Positive Logits
cancer     1.33
Cancer     1.22
cancer     1.21
Cancer     1.19
CANCER     1.04
cancers    0.98
tumor      0.87
áncer      0.78
saar       0.78
onco       0.77
Activations Density 0.061%