Explanations
Health- and medical-related terms, as well as categories and values in a structured format.
Neuron Alignment
Index    Value    % of L₁
1699     +0.16    0.5%
394      +0.16    0.5%
1343     +0.13    0.4%
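
The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading follows, assuming the column reports each neuron's absolute weight as a share of the L1 norm of the feature's direction. Under that assumption the rows above would imply an L1 norm of roughly 32 (0.16 / 0.005), up to rounding. The helper name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
import numpy as np

def l1_share(weights: np.ndarray, k: int = 3):
    """Return (index, value, share of L1 norm) for the k largest-|value| entries.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The source does not confirm this.
    """
    l1 = np.abs(weights).sum()
    # Sort indices by descending absolute value and keep the first k.
    top = np.argsort(-np.abs(weights))[:k]
    return [(int(i), float(weights[i]), float(abs(weights[i]) / l1)) for i in top]

# Toy vector for illustration only; these are not the real feature weights.
toy = np.array([0.5, -0.25, 0.125, 0.125])
rows = l1_share(toy, k=2)
for index, value, share in rows:
    print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

In a real setting `weights` would be the feature's direction expressed in the neuron basis, which is itself an assumption about how this table is produced.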
Correlated Neurons
Index    Pearson Corr.    Cosine Sim.
764      +0.16            0.05
981      +0.16            0.06
648      +0.13            0.03
Negative Logits
Token      Value
Примеча    -0.72
Hermoso    -0.71
Exelente   -0.69
Souha      -0.69
Czym       -0.66
FTFY       -0.64
Dijo       -0.64
mengg      -0.63
Dział      -0.62
Lmfao      -0.61
Positive Logits
Token      Value
vns        0.79
cance      0.76
↔          0.75
aen        0.75
wien       0.74
„,         0.73
effe       0.73
magis      0.72
lii        0.70
ohr        0.70
Activations Density: 0.304%