INDEX
Explanations
sports-related information, focusing on team names and player updates
Neuron Alignment
Index   Value   % of L₁
1741    +0.13   0.4%
1177    +0.11   0.3%
736     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1965    +0.13      0.05
648     +0.11      0.03
632     +0.09      0.03
Negative Logits
Ottobre      -1.03
Settembre    -1.00
Luglio       -0.98
Giugno       -0.96
cammin       -0.96
particolar   -0.91
affez        -0.90
esplor       -0.89
sovra        -0.86
affitto      -0.85
Positive Logits
[])          0.56
panik        0.54
’            0.53
disiplin     0.51
'            0.51
SPOILER      0.50
Xr           0.50
McLaugh      0.49
team         0.49
][-          0.48
Activations Density: 0.179%