Explanations
comparative adjectives indicating a higher degree
Neuron Alignment
Index   Value   % of L₁
50      +0.19   1.2%
528     +0.09   0.6%
976     +0.09   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
976     +0.19      0.10
1872    +0.09      0.09
486     +0.09      0.07
Negative Logits
<bos>      -3.11
HasIndex   -0.73
implement  -0.64
//---      -0.64
adopt      -0.63
obtain     -0.63
ⓧ          -0.63
/**        -0.62
//});      -0.62
/**        -0.62
Positive Logits
ecru       1.52
swarovski  1.45
bangkok    1.31
burberry   1.27
stockholm  1.26
eiffel     1.26
venice     1.25
sentra     1.25
riviera    1.25
uniqlo     1.25
Activations Density 0.290%