INDEX
Explanations
dates and numerical information like percentages and statistics
Neuron Alignment
Index    Value    % of L₁
1271     +0.14    0.5%
667      +0.13    0.5%
1978     +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
321      +0.14       0.09
1271     +0.13       0.08
667      +0.13       0.07
Negative Logits
abestanden       -0.61
Diweddarwch      -0.59
Cyfarwyddwr      -0.58
ivelany          -0.57
Αν               -0.56
Reino            -0.54
بيها             -0.54
ἔ                -0.53
djangoproject    -0.53
Wichtig          -0.53
Positive Logits
strick      1.02
increa      1.02
effe        0.97
affor       0.94
suscep      0.93
guarante    0.93
purcha      0.92
?...        0.92
fuf         0.92
impra       0.92
Activations Density 0.252%