INDEX
Explanations
email-related phrases, particularly related to receiving emails
Neuron Alignment
Index   Value   % of L₁
674     +0.10   0.3%
678     +0.10   0.3%
1356    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1702    +0.10      0.03
1356    +0.10      0.02
678     +0.09      0.03
Negative Logits
RemoveField   -0.55
isSuccess     -0.52
Etimo         -0.52
paragraphe    -0.50
'&:           -0.48
OSPITAL       -0.46
Folgende      -0.46
carboxylic    -0.46
adduced       -0.46
('');         -0.46
Positive Logits
<bos>         0.73
tages         0.66
flotte        0.65
succede       0.65
dimenti       0.65
messe         0.64
notor         0.62
anse          0.61
dimentic      0.58
robus         0.58
Activations Density: 0.107%