    Explanations

    This neuron fires on spans of direct speech or quoted statements, i.e. dialogue and “said”-style quotes.

    Negative Logits
    Token              Logit
    " Αν"              -0.06
    " hlavně"          -0.06
    ".Localization"    -0.06
    " sociale"         -0.06
    " vad"             -0.06
    " επίσης"          -0.06
    " Quarter"         -0.06
    ".Copy"            -0.06
    " minions"         -0.06
    "-services"        -0.06

    Positive Logits
    Token              Logit
    " outcome"         0.07
    "POSIT"            0.07
    " Parm"            0.07
    "ें↵"              0.07
    "SENT"             0.06
    " getP"            0.06
    "하는"             0.06
    "apply"            0.06
    "league"           0.06
    " tous"            0.06

    Activation Density: 0.034%

    No Known Activations