INDEX
Explanations
references to specific player names and positions in a sports context
Neuron Alignment
Index   Value   % of L₁
1407    +0.12   0.4%
1092    +0.11   0.3%
555     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1120    +0.12      0.05
690     +0.11      0.06
1343    +0.09      0.05
Negative Logits
<bos>       -0.72
atque       -0.60
rerum       -0.59
ніципалі    -0.57
tehd        -0.55
denim       -0.54
OrNil       -0.54
wikidata    -0.54
fristi      -0.53
roughts     -0.52
Positive Logits
Levit         1.41
Kap           1.21
Kap           1.12
Kaplan        1.06
lamborghini   1.00
gsx           0.97
gim           0.97
fup           0.96
levit         0.95
Kah           0.94
Activations Density: 0.338%