    Explanations

    The neuron selectively activates on the “Lie” action tags (the “(Lie …)” markers) in the dialogue.

    Negative Logits
    Token          Logit
    " parsed"      -0.08
    "_Internal"    -0.07
    " Terrace"     -0.07
    "clared"       -0.07
    " Schwe"       -0.07
    "Martin"       -0.07
    " quel"        -0.06
    " roomId"      -0.06
    "715"          -0.06
    "worked"       -0.06
    Positive Logits
    Token           Logit
    "DataSource"    0.06
    "_aff"          0.06
    " lobbyists"    0.06
    " RPG"          0.06
    "ưỡng"          0.06
    " Living"       0.06
    " Initialize"   0.06
    " δεν"          0.06
    "ENSE"          0.06
    "[data"         0.06
    Activation Density: 0.004%

    No Known Activations