    Explanations

    This neuron detects words that express sincerity or genuine sentiment (e.g., “sincere,” “heartfelt,” “genuinely”).

    Negative Logits
    Token             Logit
    ".'"              -0.07
    " Ideally"        -0.06
    " loggedIn"       -0.06
    "Def"             -0.06
    " ontology"       -0.06
    " enumerated"     -0.06
    " orderBy"        -0.06
    " August"         -0.06
    ".ev"             -0.06
    " deaths"         -0.06
    Positive Logits
    Token                 Logit
    " sincere"            0.14
    " sincerely"          0.11
    " sincerity"          0.09
    [token not shown]     0.09
    " sincer"             0.08
    " Narc"               0.08
    " bếp"                0.07
    " heartfelt"          0.07
    [token not shown]     0.07
    [token not shown]     0.07
    Activation Density: 0.005%

    No Known Activations