INDEX
Explanations
Japanese names and locations
Neuron Alignment
Index   Value   % of L₁
1535    +0.08   0.2%
1585    +0.08   0.2%
227     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
981     +0.08      0.05
227     +0.08      0.04
1585    +0.08      0.02
Negative Logits
Token            Logit
abestanden       -0.87
<=",             -0.75
Hiện             -0.69
новништво        -0.67
Được             -0.65
Về               -0.63
ografija         -0.63
ViewFeatures     -0.63
autorytatywna    -0.63
gyhoeddwyd       -0.62
Positive Logits
Token        Logit
deleter      1.43
compen       1.36
pessi        1.36
contex       1.33
uniqu        1.31
inev         1.28
emphat       1.28
intermitt    1.27
laun         1.26
inappro      1.23
Activations Density: 0.183%