INDEX
    Explanations

    instances of strong emotional responses or tensions in discussions

    chat messages and user tags

    This neuron detects turn-boundary tokens, i.e., the end-of-turn / conversation-boundary marker.

    Negative Logits
     Administrativna
    -0.73
     otomatig
    -0.70
    хьтан
    -0.64
    нгред
    -0.62
    uxxxx
    -0.62
    <unused41>
    -0.60
    ſſung
    -0.60
    <unused8>
    -0.60
    <unused3>
    -0.60
    <pad>
    -0.60
    Positive Logits
    It
    0.44
    I
    0.42
     It
    0.41
    archiviato
    0.40
     it
    0.38
    There
    0.38
    He
    0.37
     pretty
    0.36
     I
    0.36
    That
    0.36
    Act Density 0.034%

    No Known Activations