INDEX
Explanations
words related to a specific person named "Sal"
Neuron Alignment
Index   Value   % of L₁
1323    +0.16   0.7%
544     +0.15   0.7%
501     +0.12   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1323    +0.16      0.03
544     +0.15      0.03
1387    +0.12      0.02
Negative Logits
ējās          -0.55
posób         -0.55
otheby        -0.55
obstin        -0.53
ardour        -0.53
apprehen      -0.52
thoughtless   -0.52
barbarous     -0.51
belliger      -0.50
repug         -0.50
Positive Logits
Sal      1.45
Sal      1.34
sal      1.33
SAL      1.24
sal      1.24
SAL      1.14
Salman   0.86
Sall     0.81
Salon    0.81
Sali     0.78
Activations Density 0.102%