    Explanations

    The neuron fires on descriptive, feature-oriented words: nouns such as “elements,” “gameplay,” “style,” and “atmosphere” that describe attributes of films or games.

    Negative Logits
    mouseout      -0.07
    uk            -0.06
     Dept         -0.06
    .one          -0.06
     conte        -0.06
     topLeft      -0.06
     Sachs        -0.06
     عبارت        -0.06
    fortunately   -0.06
     ques         -0.06
    Positive Logits
    \F            0.07
     savory       0.07
     anxious      0.06
    (token not shown)  0.06
    ovy           0.06
     نفر          0.06
     hük          0.06
    	RTLU      0.06
    _dbg          0.06
    (token not shown)  0.06
    Activation Density: 0.017%
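For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of sampled tokens on which the neuron's activation is above zero; the page does not state its exact definition, and the function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron is active. The source page does not define it precisely.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 2 of which activate the neuron.
sample = [0.0] * 9998 + [0.8, 1.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.017% would correspond to roughly 17 active tokens per 100,000 sampled.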

    No Known Activations