Explanations
phrases related to sports events and game progress
Neuron Alignment
Index    Value    % of L₁
1081     +0.11    0.3%
1252     +0.09    0.3%
581      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1081     +0.11       0.07
254      +0.09       0.05
270      +0.08       0.04
Negative Logits
tinte      -0.98
utop       -0.97
migli      -0.93
solidar    -0.93
apparti    -0.92
palio      -0.90
ideolog    -0.89
elek       -0.87
Ottobre    -0.87
impon      -0.85
Positive Logits
unspeak     0.82
shenan      0.78
vainly      0.76
impelled    0.76
unve        0.72
ineffec     0.71
of          0.71
apprehen    0.69
roused      0.68
unavoid     0.68
Activations Density 0.338%