INDEX
Explanations
Francisco
This neuron detects mentions of “San Francisco” (including variations like “San Francisco Bay Area”).
Negative Logits
-opt      -0.08
(em       -0.07
�         -0.06
Jac       -0.06
(ans      -0.06
Lew       -0.06
_led      -0.06
Lena      -0.06
357       -0.06
_again    -0.06
Positive Logits
Francisco  0.11
SF         0.08
τσι        0.07
ां         0.07
usando     0.07
dó         0.07
AYOUT      0.07
Đảng       0.07
로부터       0.07
われる       0.07
Activations Density 0.006%
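
To put the density figure in context, here is a minimal sketch, assuming that "activations density" means the fraction of token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, and the toy data are all assumptions made for illustration. None of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: density = (# firing positions) / (# positions).
    """
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Hypothetical toy data: the neuron fires on 6 of 100,000 token positions.
acts = [0.0] * 100_000
for i in (10, 500, 2_000, 40_000, 75_000, 99_999):
    acts[i] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.006%"
```

Under this reading, a density of 0.006% would correspond to roughly 6 firing tokens per 100,000, which fits a neuron that responds only to a specific place name.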