    Explanations

    Interests and hobbies

    This neuron activates on words naming personal hobbies, interests, or leisure activities.

    Negative Logits
     Self
    -0.08
     poured
    -0.07
     NP
    -0.07
     Control
    -0.07
    Capture
    -0.07
     지원
    -0.07
    обрет
    -0.06
    _package
    -0.06
     работать
    -0.06
     Skinny
    -0.06
    Positive Logits
    				      
    0.07
    					      
    0.06
    ;width
    0.06
    ():↵
    0.06
    :^
    0.06
    0.06
     correl
    0.06
    ('~
    0.06
    0.06
    "]),
    0.06
    Act Density 0.113%

    No Known Activations