INDEX
Explanations
references to legal documents and European Parliament content
Neuron Alignment
Index   Value   % of L₁
752     +0.19   0.6%
453     +0.12   0.4%
1778    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
752     +0.19      0.05
16      +0.12      0.06
562     +0.11      0.04
Negative Logits
Nhưng    -0.77
Còn      -0.74
Và       -0.74
Trọng    -0.72
tardes   -0.66
لاثة     -0.66
Trước    -0.65
ECONDS   -0.62
Làm      -0.61
Đây      -0.61
Positive Logits
reluct   1.51
impra    1.50
depic    1.46
excru    1.43
unve     1.41
increa   1.38
inev     1.38
accla    1.36
vagu     1.36
secon    1.36
Activations Density: 0.248%