Explanations
terms related to technology and products
Neuron Alignment
Index    Value    % of L₁
101      +0.09    0.3%
1937     +0.08    0.3%
241      +0.08    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1120     +0.09       0.04
783      +0.08       0.03
823      +0.08       0.03
Negative Logits
<bos>         -1.05
relocate      -0.68
put           -0.66
defray        -0.63
reimburse     -0.61
public        -0.61
abolish       -0.60
cancel        -0.59
prioritize    -0.59
expel         -0.59
Positive Logits
applau        1.32
multicolore   1.18
roul          1.15
rafra         1.15
matel         1.15
bourgeo       1.14
sappi         1.13
😭😭          1.12
!...          1.11
casio         1.11
Activations Density: 0.282%