    Explanations

    This neuron fires on discourse markers that introduce or enumerate new types or categories (e.g. “Another,” “type,” “various,” “In”).

    Negative Logits
    (redis
    -0.07
    _already
    -0.07
     тепер
    -0.06
     yerine
    -0.06
    oor
    -0.06
    irler
    -0.06
    	queue
    -0.06
    Pad
    -0.06
     pq
    -0.06
    ilere
    -0.06
    Positive Logits
    H
    0.07
    >--}}↵
    0.06
    ED
    0.06
    -tier
    0.06
    uggest
    0.06
    _Class
    0.06
     Sandy
    0.06
    JT
    0.06
    Coffee
    0.06
     '"';↵
    0.06
    Activation Density: 0.039%

    No Known Activations