INDEX
Explanations
verbs related to physical actions involving bending or straightening
Neuron Alignment
Index    Value    % of L₁
1515     +0.11    0.5%
1328     +0.11    0.5%
50       +0.10    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
690      +0.11       0.03
1328     +0.11       0.03
1385     +0.10       0.03
Negative Logits
<bos>           -2.31
podr            -0.74
Kanpo           -0.72
Pró             -0.69
PrivateRoute    -0.69
ⓧ               -0.69
Galería         -0.68
Bibliograf      -0.68
Espa            -0.67
itemBuilder     -0.67
Positive Logits
indestru     1.61
shenan       1.57
stockholm    1.55
increa       1.49
eiffel       1.47
maroc        1.45
thut         1.45
pollut       1.44
unspeak      1.44
madonna      1.43
Activations Density 0.193%