INDEX
Explanations
specific sports teams and sports-related terms
Neuron Alignment
Index    Value    % of L₁
227      +0.23    0.7%
1013     +0.11    0.3%
1741     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
227      +0.23       0.10
245      +0.11       0.03
1551     +0.09       0.04
Negative Logits
effe         -1.60
unden        -1.59
ftu          -1.58
aen          -1.58
impra        -1.57
affor        -1.57
swarovski    -1.56
increa       -1.55
encomp       -1.55
embodi       -1.53
Positive Logits
during    0.80
.         0.77
since     0.76
after     0.76
where     0.72
before    0.71
until     0.70
,         0.69
;         0.68
while     0.67
Activations Density 0.499%