Explanations
code snippets
The neuron activates strongly on long runs of a single token repeated many times in a row, i.e. monotonous repetition sequences.
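To make "long runs of the same token" concrete, the sketch below measures the longest run of identical adjacent tokens in a sequence. It only illustrates the kind of input the explanation describes; it is not how the explanation was produced. The function names and the `min_run` threshold are hypothetical and do not come from the dashboard.

```python
from itertools import groupby


def longest_repeated_run(tokens):
    """Return (token, length) for the longest run of identical adjacent tokens."""
    best_token, best_len = None, 0
    for token, group in groupby(tokens):
        run_len = sum(1 for _ in group)
        if run_len > best_len:
            best_token, best_len = token, run_len
    return best_token, best_len


def is_monotonous(tokens, min_run=8):
    """Flag sequences containing a run of at least min_run identical tokens.

    min_run is an assumed, illustrative threshold, not a value from the source.
    """
    _, run_len = longest_repeated_run(tokens)
    return run_len >= min_run
```

For example, `is_monotonous(["ha"] * 10)` is true, while an alternating sequence such as `["a", "b"] * 10` is not flagged, because no single token repeats back to back.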
Negative Logits
Illum        -0.07
�            -0.07
حذ           -0.07
�            -0.07
Exterior     -0.07
亜           -0.06
�            -0.06
�            -0.06
릴           -0.06
&display     -0.06

Positive Logits
CITY         0.07
hearty       0.07
932          0.06
ój           0.06
dirname      0.06
AMENT        0.06
内の         0.06
그래서       0.06
Admission    0.06
@Repository  0.06
Activations Density 0.017%
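The page does not say how activation density is computed. A common reading, assumed here, is the fraction of token positions on which the neuron's activation is above zero. The sketch below implements that assumed definition; `activation_density` and its `threshold` parameter are hypothetical names chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation exceeds threshold.

    Assumed definition: the source page does not state its own formula.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)
```

Under this reading, a density of 0.017% would correspond to roughly 17 active positions per 100,000 tokens. That figure is plain arithmetic on the displayed percentage, not a measurement from the dashboard.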