    Explanations

    occurrences of the word "man" in various contexts

    Negative Logits
    erties      -0.17
    afen        -0.15
    maal        -0.15
    ted         -0.15
    erte        -0.15
    iran        -0.15
    人çī©       -0.15
    ayne        -0.15
    $MESS       -0.14
    istes       -0.14
    Positive Logits
    iac         0.31
    hattan      0.28
    ufac        0.24
    hunt        0.23
    hood        0.23
    agment      0.22
    agements    0.22
    tras        0.22
    opause      0.21
    ifold       0.21
    Activation Density: 0.071%

    No Known Activations