INDEX
Explanations
words related to accessories and objects like bags, laptops, handbags, and wheels
Neuron Alignment
Index   Value   % of L₁
50      +0.17   1.0%
1350    +0.14   0.8%
90      +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1350    +0.17      0.03
1385    +0.14      0.03
200     +0.12      0.02
Negative Logits
<bos>                -2.68
<?                   -0.78
ⓧ                    -0.75
/***                 -0.73
(no visible token)   -0.73
<?                   -0.73
subsist              -0.72
defray               -0.67
inaugurate           -0.66
familiarize          -0.65
Positive Logits
wheel    1.44
Wheel    1.32
wheel    1.27
wheels   1.24
Wheel    1.23
Wheels   1.12
WHEEL    1.12
WHEEL    1.05
Wheels   1.02
wheels   0.99
Activations Density: 0.106%