INDEX
    Negative Logits
    Token       Logit
    elta        -0.82
    ieved       -0.74
    epend       -0.74
    vez         -0.73
    enfranch    -0.72
    pez         -0.70
    opter       -0.70
    reath       -0.69
    rients      -0.67
    emis        -0.67
    Positive Logits
    Token         Logit
     anyone       0.79
     you          0.79
     anybody      0.75
     somebody     0.69
     someone      0.68
     ever         0.67
     guests       0.67
     someday      0.67
     they         0.64
     objections   0.64
    Activation Density: 0.059%

    No Known Activations