INDEX
Explanations
contact information, particularly email addresses and Twitter handles
Neuron Alignment
Index   Value   % of L₁
1343    +0.15   0.4%
1445    +0.15   0.4%
1150    +0.14   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
126     +0.15      0.03
1445    +0.15      0.03
1676    +0.14      0.02
Negative Logits
javier    -1.18
alberto   -1.11
jorge     -1.11
sergio    -1.10
elena     -1.06
roberto   -1.06
eduardo   -1.05
nicolas   -1.03
lorenzo   -1.02
claudia   -1.01
Positive Logits
stickied     0.79
intermitt    0.77
crossfit     0.70
morg         0.66
derma        0.66
dissim       0.65
externs      0.65
subreddits   0.64
rudi         0.63
portables    0.62
Activations Density 0.079%