INDEX
    Explanations

    occurrences of the word "man" and its derivatives in various contexts

    Negative Logits
    Token        Logit
    "ts"         -0.17
    "alars"      -0.16
    "gor"        -0.15
    "imler"      -0.14
    "inz"        -0.14
    "ÏĢη"        -0.14
    "icket"      -0.14
    "elm"        -0.14
    " Larson"    -0.14
    "ügen"       -0.14
    Positive Logits
    Token        Logit
    "agements"   0.28
    "hattan"     0.26
    "tras"       0.26
    "iscal"      0.25
    "agment"     0.23
    "ifest"      0.23
    "handled"    0.22
    "ifold"      0.22
    "fred"       0.22
    "uka"        0.22
    Activation Density: 0.028%

    No Known Activations