INDEX
Explanations
text related to encoding, decoding, and technical instructions
Neuron Alignment
Index   Value   % of L₁
876     +0.33   1.0%
674     +0.11   0.3%
1899    +0.09   0.3%
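The page does not say how the "% of L₁" column is derived. One plausible reading, offered here only as an assumption, is that it reports each entry's absolute value as a share of the L₁ norm (the sum of absolute values) of the whole vector. The sketch below illustrates that reading; the function name `l1_share` and the example vector are hypothetical and are not data from this page.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumed interpretation of a "% of L1" column; not confirmed by the source.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical 4-entry vector, chosen so the arithmetic is easy to check.
example = [0.5, -0.25, 0.125, 0.125]
share = l1_share(example, 0)
print(f"{share:.1%}")
```

Under this reading, a value of +0.33 listed at 1.0% would imply an L₁ norm of roughly 33 for the full vector.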
Correlated Neurons
Index   P. Corr.   Cos Sim.
876     +0.33      0.00
2044    +0.11      0.05
1899    +0.09      0.03
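The two columns report different measures, which is why they can disagree (for example +0.33 against 0.00). The page does not state which vectors are being compared, so the following is only a general sketch of the two statistics. Pearson correlation is cosine similarity applied after subtracting each vector's mean. The helper names and sample vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical sample vectors: b is a shifted by a constant offset.
a = [1.0, 2.0, 3.0]
b = [3.0, 4.0, 5.0]
print(pearson_correlation(a, b), cosine_similarity(a, b))
```

In this example the constant offset leaves the Pearson correlation at essentially 1 while the cosine similarity falls below 1, showing that the two measures need not agree.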
Negative Logits
kooper       -1.04
Demokrat     -1.03
elektronik   -1.02
Milán        -1.01
Palestina    -1.01
biografi     -1.00
Irán         -0.97
akade        -0.93
kosme        -0.92
Spanyol      -0.92
Positive Logits
encomp       2.62
guarante     2.49
shenan       2.45
increa       2.44
maneu        2.40
impra        2.39
affor        2.38
intersper    2.32
reluct       2.31
scrat        2.27
Activations Density 0.320%