INDEX
Explanations
expressions of gratitude and appreciation
Neuron Alignment
Index   Value   % of L₁
50      +0.48   1.9%
776     +0.10   0.4%
630     +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
16      +0.48      0.23
1919    +0.10      0.14
1415    +0.08      0.14
Negative Logits
<bos>          -1.60
TagMode        -0.61
通販           -0.57
also           -0.56
GraphicsUnit   -0.56
AddTagHelper   -0.54
Koordinaten    -0.53
also           -0.53
┣              -0.53
)-             -0.52
Positive Logits
Keny       1.44
McLaugh    1.37
Bartholo   1.34
Juf        1.32
unlaw      1.28
Jusqu      1.27
Haci       1.26
McInt      1.22
pamph      1.22
Rine       1.21
Activations Density 9.605%