INDEX
Explanations
mathematical equations and expressions represented in text format
Neuron Alignment
Index  Value  % of L₁
1343   +0.15  0.5%
204    +0.14  0.5%
874    +0.13  0.4%
Correlated Neurons
Index  P. Corr.  Cos Sim.
204    +0.15     0.01
874    +0.14     0.01
924    +0.13     0.01
Negative Logits
Punj     -0.64
prek     -0.60
lebte    -0.58
schloss  -0.57
slika    -0.56
schlug   -0.55
kral     -0.52
rief     -0.52
suchte   -0.52
legte    -0.52
Positive Logits
quæ              0.99
jacques          0.98
potest           0.97
habet            0.96
[token missing]  0.94
vinci            0.91
ejus             0.90
tanong           0.88
Messieurs        0.87
grati            0.87
Activations Density: 0.042%