    Explanations

    This neuron fires on the first content word of a free-form paragraph, i.e. at paragraph or section openings.

    Negative Logits
    Separ     -0.07
    -host     -0.07
    erve      -0.06
    _weak     -0.06
    imu       -0.06
    idity     -0.06
    erving    -0.06
    rew       -0.06
    rition    -0.06
     wcs      -0.06
    Positive Logits
    0.07
     Tot
    0.07
     Unsure
    0.07
    ै.↵
    0.06
     Lyme
    0.06
     spielen
    0.06
    中的
    0.06
     Τα
    0.06
     крок
    0.06
     путем
    0.06
    Act Density 0.163%

    No Known Activations