INDEX
    Explanations

    the word "man" in various contexts

    Negative Logits
    Token          Logit
    " Clay"        -0.14
    "stral"        -0.14
    "ensitive"     -0.14
    " Bbw"         -0.14
    "Sensitive"    -0.14
    " Horny"       -0.14
    "amous"        -0.14
    "اÙģØª"         -0.14
    "wort"         -0.14
    "rig"          -0.14
    Positive Logits
    Token          Logit
    "غاÙĨ"          0.19
    " simul"       0.17
    "ooke"         0.15
    "agh"          0.15
    "-Clause"      0.15
    "à¥ģà¤ļ"       0.15
    ".jobs"        0.14
    "Builder"      0.14
    "omi"          0.14
    "ilyn"         0.14
    Activation Density: 0.007%

    No Known Activations