Explanations
The neuron responds to repeated or boilerplate text: tokens that recur many times within a duplicated or templated phrase, or within sentence fragments that are reiterated.
Negative Logits
Characteristics    -0.08
shelves            -0.07
offences           -0.07
Roger              -0.07
Guard              -0.06
shelf              -0.06
Ты                 -0.06
northern           -0.06
approves           -0.06
redevelopment      -0.06
Positive Logits
ะแ                 0.06
_TYPED             0.06
.man               0.06
roleum             0.06
zaměř              0.06
RaycastHit         0.06
Specifies          0.06
dní                0.06
scious             0.06
Στο                0.06
Activations Density: 0.023%