    Explanations

    This neuron fires on the document's most information-rich tokens: proper names, numbers, and other content words carrying the core facts.

    Negative Logits
    Token             Logit
    "Joe"             -0.07
    " cach"           -0.06
    " OCI"            -0.06
    "onaut"           -0.06
    " facilities"     -0.06
    " paint"          -0.06
    " Cour"           -0.06
    " liquid"         -0.06
    " Twig"           -0.06
    " joe"            -0.06

    Positive Logits
    Token             Logit
    ".interval"       0.07
    " 상세"           0.06
    " kvůli"          0.06
    "-moving"         0.06
    "ีพ"              0.06
    " hostel"         0.06
    " дух"            0.06
    "라마"            0.06
    " длитель"        0.06
    " Müslüman"       0.06
    Activation Density: 0.043%

    No Known Activations