    Explanations

    The neuron activates on words and word pieces referring to sequels, spin-offs, or follow-up productions.

    Negative Logits
     invers
    -0.07
     Επι
    -0.07
     симв
    -0.06
     القرن
    -0.06
     whoever
    -0.06
     inducing
    -0.06
    FirstChild
    -0.06
     detecting
    -0.06
     broaden
    -0.06
    ética
    -0.06
    Positive Logits
    ł
    0.07
    anical
    0.07
     Makeup
    0.06
    0.06
     deviceId
    0.06
    -Col
    0.06
     bm
    0.06
    0.06
     telesc
    0.06
    alus
    0.06
    Activation Density: 0.027%

    No Known Activations