INDEX
Explanations
beginnings
This neuron primarily activates on date and year expressions, spotting temporal references in the text.
Negative Logits
Nope    -0.07
resc    -0.07
forgiven    -0.07
SOUND    -0.06
обнаруж    -0.06
Ст    -0.06
ละคร    -0.06
बज    -0.06
усл    -0.06
ChangeListener    -0.06
Positive Logits
(!    0.07
档    0.06
igne    0.06
Sicher    0.06
-router    0.06
angular    0.06
")[    0.06
#${    0.06
#%%    0.06
Activations Density 0.050%
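
As a rough illustration of how a density figure like the one above can be read, the sketch below assumes that density means the fraction of token positions on which the neuron's activation exceeds a threshold. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical; the page itself does not state how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 5 of which are active.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.050% corresponds to roughly 5 active positions per 10,000 tokens.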