INDEX
    Explanations

    the presence of the word "zer" in various contexts

    Negative Logits
    ership     -0.97
    ers        -0.83
    erest      -0.75
    raising    -0.74
    anooga     -0.68
    rast       -0.66
    luent      -0.65
    ivity      -0.65
    ivities    -0.65
    ifice      -0.63
    Positive Logits
    zer        0.80
    geist      0.79
    abwe       0.79
    vous       0.75
    otonin     0.74
    imbabwe    0.70
    è¦ļéĨĴ     0.69
    ploy       0.69
    ãĤº        0.69
    ppelin     0.69
    Act Density 0.017%

    No Known Activations