INDEX
Explanations
the letter "e" occurring as a single token
Neuron Alignment
Index   Value   % of L₁
872     +0.11   0.3%
876     +0.11   0.3%
1473    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
484     +0.11      0.01
1256    +0.11      0.02
950     +0.09      0.02
Negative Logits
Ekster         -0.78
pymongo        -0.69
scalatest      -0.68
Przyp          -0.67
Alamofire      -0.67
bouncycastle   -0.65
Vaata          -0.63
Eksteraj       -0.63
Leírás         -0.62
testng         -0.62
Positive Logits
scrat          1.06
disagre        1.05
maneu          1.01
increa         1.01
intersper      1.00
impra          0.99
accla          0.96
suscep         0.96
inev           0.95
disgra         0.93
Activations Density: 0.048%