Explanations
specific team names and player references in sports contexts
Neuron Alignment
Index    Value    % of L₁
50       +0.28    1.2%
1577     +0.16    0.7%
1919     +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1919     +0.28       0.21
1097     +0.16       0.20
862      +0.13       0.12
Negative Logits
intersper      -1.16
trouva         -1.04
<bos>          -1.04
Augu           -1.01
sii            -1.00
purcha         -0.99
fup            -0.98
trasparente    -0.95
fta            -0.95
mef            -0.95
Positive Logits
VYMaps     0.81
==""){     0.71
todella    0.70
)++;       0.69
িখ         0.66
Савезне    0.65
wiem       0.64
jestli     0.63
trei       0.63
didn       0.63
Activations Density: 2.545%