Explanations
The neuron fires on subword pieces of verbs meaning “to believe” (e.g. German glaub- and Romance cred-/creer-) across contexts.
Negative Logits
<Result         -0.07
requirements    -0.07
[[]             -0.06
дек             -0.06
Bd              -0.06
Phys            -0.06
(name           -0.06
Kitty           -0.06
defiance        -0.06
train           -0.06
Positive Logits
uant            0.07
못              0.07
�               0.07
Crowley         0.07
romium          0.07
believe         0.06
сто             0.06
UDA             0.06
existing        0.06
Fundamental     0.06
Activations Density 0.023%