INDEX
Explanations
descriptions indicating the action of treating someone or something
Neuron Alignment
Index   Value   % of L₁
50      +0.19   1.2%
544     +0.18   1.1%
82      +0.13   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
544     +0.19      0.06
1296    +0.18      0.04
1145    +0.13      0.04
Negative Logits
<bos>           -3.48
//              -0.78
var             -0.77
public          -0.75
AutoScaleMode   -0.75
Пор             -0.75
beta            -0.74
namespace       -0.72
import          -0.72
dom             -0.72
Positive Logits
maneu       2.38
Juf         2.35
fta         2.32
increa      2.26
ftu         2.26
thut        2.25
aen         2.24
stockholm   2.23
affor       2.19
accla       2.17
Activations Density: 0.114%