INDEX
Explanations
The neuron responds to specific text patterns or phrases beginning with symbols like 'âĢ'.
Instances of repeated phrases or placeholders in a text.
Negative Logits
Kenyan      -0.74
vt          -0.69
Mansion     -0.67
liest       -0.64
lim         -0.63
Saga        -0.62
Recovery    -0.62
Trip        -0.60
è£ħ         -0.60
Mun         -0.60
Positive Logits
£           0.88
º           0.87
âģ          0.81
¹           0.80
Ì           0.79
abad        0.79
âĹ¼         0.77
acca        0.74
¬           0.72
Pg          0.70
Activations Density 0.036%
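
Token strings such as 'âĢ', 'âģ' and 'âĹ¼' look like encoding debris, but they are consistent with how GPT-2 style byte-level BPE vocabularies display raw bytes: each byte is mapped to a printable character. The page does not say which tokenizer is used, so the sketch below assumes that GPT-2 style byte-to-character table. The helper names (`bytes_to_unicode`, `token_to_bytes`, `describe`) are illustrative and are not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table of GPT-2 style byte-level BPE.

    Assumption: the dashboard shows tokens in this alphabet; the page does not confirm it.
    """
    # Bytes that are already printable keep their own character.
    keep = (list(range(ord("!"), ord("~") + 1))
            + list(range(ord("¡"), ord("¬") + 1))
            + list(range(ord("®"), ord("ÿ") + 1)))
    table = {b: chr(b) for b in keep}
    # Every other byte is shifted to a code point starting at 256.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Map a displayed token string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def describe(token: str):
    """Return (hex bytes, decoded text or None if the bytes are not complete UTF-8)."""
    raw = token_to_bytes(token)
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    try:
        return hex_bytes, raw.decode("utf-8")
    except UnicodeDecodeError:
        return hex_bytes, None


if __name__ == "__main__":
    for tok in ["âĢ", "âģ", "âĹ¼", "è£ħ", "£"]:
        print(repr(tok), describe(tok))
```

Under this assumption, 'âĢ' stands for the bytes `e2 80`, the first two bytes of a three-byte UTF-8 sequence in the U+2000 to U+203F range (general punctuation such as curly quotes and dashes), and 'âĹ¼' decodes to the single character U+25FC. Several of the top positive-logit tokens would then be partial UTF-8 sequences rather than complete characters.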