Explanations
Defining a process
This neuron activates on definition-style phrases: tokens such as “refers to,” “is the process of,” or “meaning” that introduce or signal a definition.
Negative Logits
growth        -0.07
Pandora       -0.06
xiety         -0.06
Photon        -0.06
Safety        -0.06
지원          -0.06
ator          -0.06
dispro        -0.06
信            -0.05
economics     -0.05
Positive Logits
oves            0.07
/gr             0.07
IRTUAL          0.07
verilm          0.07
FROM            0.07
enses           0.07
undra           0.07
.columnHeader   0.07
zahrn           0.06
nejlepší        0.06
Activations Density: 0.087%