INDEX
Explanations
facts about achievements or rankings in sports
Neuron Alignment
Index    Value    % of L₁
1042     +0.12    0.3%
718      +0.08    0.2%
441      +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
718      +0.12       0.04
362      +0.08       0.05
441      +0.07       0.04
Negative Logits
glau            -0.58
vogli           -0.58
décid           -0.58
morire          -0.58
décé            -0.56
credere         -0.56
informé         -0.55
parteci         -0.55
preghiera       -0.54
indépendante    -0.53
Positive Logits
ranked       0.86
shenan       0.73
deth         0.72
deserving    0.72
deserved     0.70
upvoted      0.68
underval     0.68
ranking      0.68
qualifies    0.67
listed       0.67
Activations Density 0.570%