    Explanations

    The neuron activates on words that express emotional or physical closeness (e.g., “close,” “closeness”) and indicate an intimate bond between characters.

    Negative Logits
     free       -0.06
     несп       -0.06
    Tmp         -0.06
     shore      -0.06
     Haut       -0.06
     cortisol   -0.06
     aime       -0.05
    "](         -0.05
     insults    -0.05
     deix       -0.05
    Positive Logits
    zew         0.07
     ROT        0.07
    STATIC      0.06
     pens       0.06
    imming      0.06
     skys       0.06
     Teams      0.06
    학년        0.06
     verge      0.06
    HOME        0.06
    Activation Density: 0.012%

    No Known Activations