Explanations
expressions related to numbers
Neuron Alignment
Index   Value   % of L₁
1967    +0.12   0.4%
453     +0.11   0.3%
1984    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1373    +0.12      0.01
1984    +0.11      0.03
256     +0.10      0.03
Negative Logits
Token       Logit
coté        -0.71
vainqueur   -0.70
shenan      -0.69
éto         -0.68
lapin       -0.67
unspeak     -0.67
désert      -0.67
delà        -0.66
levier      -0.66
phare       -0.66
Positive Logits
Token     Logit
sappi     0.67
glance    0.64
vecin     0.53
vece      0.53
ridu      0.53
perif     0.52
ideolog   0.51
complic   0.50
democra   0.50
volete    0.50
Activations Density 0.075%