INDEX
    Explanations

    роф: chosen because this token appears repeatedly in MAX_ACTIVATING_TOKENS.

    Negative Logits
    Token             Logit
    " charms"         -0.09
    " cardstock"      -0.08
    " charm"          -0.08
    " codecs"         -0.08
    " volt"           -0.08
    " Libert"         -0.08
    " χει"            -0.08
    " zak"            -0.07
    " airborne"       -0.07
    " ballistic"      -0.07

    Positive Logits
    Token             Logit
    "very"            0.07
    " compartment"    0.07
    "-Val"            0.07
    "healthy"         0.07
    "ef"              0.07
    "407"             0.07
    "esp"             0.07
    "don"             0.07
    "quinas"          0.07
    ".Full"           0.07

    Act Density 0.001%

    No Known Activations