Explanations
This neuron fires on the document's most information-rich tokens: proper names, numbers, and other content words that carry the core facts.
Negative Logits
Token         Logit
Joe           -0.07
cach          -0.06
OCI           -0.06
onaut         -0.06
facilities    -0.06
paint         -0.06
Cour          -0.06
liquid        -0.06
Twig          -0.06
joe           -0.06
Positive Logits
Token         Logit
.interval     0.07
상세          0.06
kvůli         0.06
-moving       0.06
ีพ            0.06
hostel        0.06
дух           0.06
라마          0.06
длитель       0.06
Müslüman      0.06
Activations Density 0.043%
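
To put the density figure in perspective, the short sketch below converts it into an expected token count. It assumes that "Activations Density" means the fraction of tokens on which the neuron has a nonzero activation; the function name and the one-million-token sample size are hypothetical and chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> int:
    """Expected number of tokens with nonzero activation.

    Assumption: density_percent is the percentage of tokens on which
    the neuron fires at all (0.043 means 0.043%).
    """
    return round(density_percent / 100 * n_tokens)


# Hypothetical sample of one million tokens.
active = expected_active_tokens(0.043, 1_000_000)
print(active)  # 430
```

Under that assumption, the neuron would be active on roughly 430 of every million tokens, or about one token in 2,300.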