INDEX
Explanations
information related to electronic communication, especially email addresses
Neuron Alignment
Index   Value   % of L₁
674     +0.23   0.8%
1150    +0.16   0.6%
1343    +0.15   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
872     +0.23      0.09
1843    +0.16      0.05
1150    +0.15      0.02
Negative Logits
ftre    -1.07
fta     -1.03
ftu     -1.00
fup     -0.99
fto     -0.96
fays    -0.92
vns     -0.92
thut    -0.92
tew     -0.91
aen     -0.89
Positive Logits
rungsseite      0.57
please          0.52
parsedMessage   0.49
***!            0.49
पया             0.49
wła             0.49
retraso         0.49
дописавши       0.47
below           0.46
apologize       0.46
Activations Density 1.204%