INDEX
Explanations
phrases and words related to contractual agreements or agreements in general
Neuron Alignment
Index   Value   % of L₁
50      +0.10   0.4%
1870    +0.06   0.3%
844     +0.06   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
844     +0.10      0.04
1593    +0.06      0.03
286     +0.06      0.03
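The two columns in the Correlated Neurons table measure different things, which is why a neuron can score +0.10 on one and 0.04 on the other. The sketch below shows the standard definitions of Pearson correlation and cosine similarity in plain Python. It assumes, without confirmation from this page, that "P. Corr." is a Pearson correlation and "Cos Sim." is a cosine similarity between two vectors; the function names and sample vectors are hypothetical and are not taken from the dashboard.

```python
import math


def pearson_corr(xs, ys):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    # Return 0.0 for a constant vector, where the correlation is undefined.
    return cov / norm if norm else 0.0


def cosine_sim(xs, ys):
    """Cosine similarity: dot product divided by the product of L2 norms."""
    dot = sum(a * b for a, b in zip(xs, ys))
    norm = math.sqrt(sum(a * a for a in xs)) * math.sqrt(sum(b * b for b in ys))
    # Return 0.0 for a zero vector, where the similarity is undefined.
    return dot / norm if norm else 0.0


# Hypothetical vectors: a constant offset leaves the Pearson correlation
# unchanged but lowers the cosine similarity.
a = [1.0, 2.0, 3.0]
b = [11.0, 12.0, 13.0]
print(pearson_corr(a, b))
print(cosine_sim(a, b))
```

Pearson correlation subtracts the mean before comparing, so it ignores constant offsets, while cosine similarity compares the raw directions of the two vectors.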
Negative Logits
<bos>       -1.86
ⓧ           -0.84
<tfoot>     -0.77
            -0.75
േ           -0.71
ുറ          -0.71
/*          -0.70
enumerate   -0.69
ടു          -0.69
            -0.68
Positive Logits
disagre     2.33
affor       2.33
aen         2.31
increa      2.30
squa        2.27
maneu       2.25
emphat      2.25
fta         2.24
impra       2.21
strick      2.20
Activations Density 0.063%