INDEX
Explanations
words related to awards and achievements in the context of sports
Neuron Alignment
Index   Value   % of L₁
599     +0.10   0.3%
764     +0.08   0.2%
1499    +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
77      +0.10      0.03
718     +0.08      0.02
761     +0.08      0.02
Negative Logits
tetrach      -0.89
effe         -0.88
shewn        -0.86
liberality   -0.84
anhyd        -0.83
pollut       -0.80
nutella      -0.80
onely        -0.80
unlaw        -0.79
fta          -0.79
Positive Logits
konkre          0.88
kosme           0.79
konserv         0.73
kompati         0.72
distrik         0.72
infrastruktur   0.71
minimalis       0.69
praktik         0.69
kriminal        0.69
vertik          0.68
Activations Density: 0.118%