INDEX
    Explanations

    The neuron fires principally on non-English (foreign-language) tokens.

    Negative Logits
     bw           -0.08
     ад           -0.07
    .constructor  -0.07
    ungeons       -0.06
     тим          -0.06
                  -0.06
    -ce           -0.06
     subdiv       -0.06
     flam         -0.06
     rnd          -0.06
    Positive Logits
    acer          0.07
    nano          0.07
     firm         0.07
    says          0.07
     fotograf     0.06
     paylaş       0.06
    -spec         0.06
     Says         0.06
     Cert         0.06
     Curl         0.06
    Act Density 0.294%

    No Known Activations