INDEX
    Explanations

    hypothetical statements

    The neuron activates on reporting or speculative verbs (e.g. “say,” “imply,” “would”) that introduce author commentary or hypothetical statements.

    Negative Logits
     offer          -0.08
    idges           -0.07
    BERT            -0.07
    _refs           -0.06
    Lisa            -0.06
     down           -0.06
     Blood          -0.06
     Scott          -0.06
    _Callback       -0.06
    Logic           -0.06
    Positive Logits
     PropertyValue  0.07
     neger          0.06
    (blank)         0.06
    .Schedule       0.06
    (Default        0.06
    ливі            0.06
    .Server         0.06
    (blank)         0.06
    ",@"            0.06
     определить     0.06
    Act Density 0.028%

    No Known Activations