INDEX
    Explanations

    references to the word "man"

    Negative Logits
    Token    Logit
    ted      -0.18
    tin      -0.17
    ga       -0.17
    genic    -0.17
    gate     -0.17
    onaut    -0.17
    iesen    -0.17
    gen      -0.16
    go       -0.16
    gie      -0.15
    Positive Logits
    Token       Logit
    iac         0.31
    hattan      0.28
    agements    0.25
    alysis      0.24
    agers       0.23
    ifold       0.23
    UEL         0.23
    ufact       0.23
    agment      0.22
    ifest       0.22
    Activation Density: 0.062%

    No Known Activations