INDEX
Explanations
phrases related to commercial transactions and location-based services
Neuron Alignment
Index   Value   % of L₁
231     +0.12   0.6%
198     +0.10   0.6%
35      +0.10   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
198     +0.12      0.07
162     +0.10      0.08
55      +0.10      0.07
Negative Logits
¯        -1.70
ash      -1.69
)].      -1.62
myself   -1.58
ĥ½       -1.58
´        -1.55
:&       -1.53
µ        -1.53
¦        -1.52
ı        -1.49
Positive Logits
dating   1.74
going    1.64
:`       1.57
tes      1.56
---|     1.56
worthy   1.54
glass    1.51
amazon   1.51
watson   1.50
recht    1.48
Activations Density: 0.792%