INDEX
Explanations
references to formation or establishment of groups and projects
Neuron Alignment
Index   Value   % of L₁
50      +0.31   1.2%
227     +0.16   0.6%
752     +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
16      +0.31      0.10
752     +0.16      0.07
227     +0.13      0.09
Negative Logits
<bos>    -2.68
***!     -0.98
Về       -0.92
Ngoài    -0.91
Nhà      -0.89
|=\      -0.86
Làm      -0.86
Còn      -0.85
public   -0.85
Nhưng    -0.85
Positive Logits
affor    2.87
accla    2.74
increa   2.65
impra    2.61
maneu    2.58
reluct   2.50
Juf      2.49
inev     2.48
milf     2.47
strick   2.46
Activations Density 0.776%