Explanations
punctuation marks at the end of sentences or clauses
Neuron Alignment
Index   Value   % of L₁
271     +0.16   0.9%
23      +0.15   0.8%
412     +0.14   0.8%
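
The "% of L₁" column is not defined on this page. One plausible reading is that it gives each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the whole weight vector. The sketch below illustrates that reading only. The function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard or from the feature's real weights.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by
    the sum of absolute values of all entries.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical toy vector, not the feature's real weights.
toy_weights = [0.5, -0.25, 0.25]
share = l1_share(toy_weights, 0)
print(f"{share:.1%}")
```

Under this reading, small percentages such as those in the table would mean that the alignment is spread across many neurons and no single neuron dominates.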
Correlated Neurons
Index   P. Corr.   Cos Sim.
258     +0.16      0.04
321     +0.15      0.03
23      +0.14      0.04
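
The two columns in the table above can differ because Pearson correlation centers each vector on its mean before comparing, while cosine similarity does not. The sketch below shows that relationship on toy data. It assumes both metrics are computed over paired vectors of equal length. The page does not state which vectors are compared, and the helper names and inputs here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy vectors, not data from the dashboard.
feature_acts = [1.0, 2.0, 3.0]
neuron_acts = [3.0, 5.0, 7.0]
print(pearson_correlation(feature_acts, neuron_acts))
print(cosine_similarity(feature_acts, neuron_acts))
```

In this toy case the two vectors are related by an exact linear map with an offset, so the Pearson correlation is 1 up to rounding, while the cosine similarity falls slightly below 1 because of the offset.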
Negative Logits
himself    -1.83
otropic    -1.73
herself    -1.66
icum       -1.54
ich        -1.51
asek       -1.50
rpm        -1.48
gart       -1.46
ium        -1.46
ieri       -1.45
Positive Logits
Ī             1.61
However       1.61
Secondly      1.56
then          1.52
ĻĤ            1.50
indexOf       1.48
Moreover      1.47
Furthermore   1.46
oxford        1.46
ãģĵãģ®        1.46
Activations Density 0.104%