INDEX
Explanations
Biographical/historical texts
The neuron activates on numeric date tokens (years and year ranges) in the text.
Negative Logits
Tut          -0.07
Arts         -0.06
arts         -0.06
laştır       -0.06
七           -0.06
Pep          -0.06
přátel       -0.06
기준         -0.06
Estimates    -0.06
uros         -0.06
Positive Logits
browsing     0.07
rall         0.06
-rule        0.06
rematch      0.06
dood         0.06
disin        0.06
clause       0.06
igidBody     0.06
alm          0.06
…↵↵↵         0.06
Activations Density 0.029%