INDEX
    Explanations

    mentions of the word "humans" in the text

    Negative Logits
    " Crom"         -0.71
    "ounter"        -0.66
    "sbm"           -0.64
    "pton"          -0.63
    "tie"           -0.63
    " exclusive"    -0.62
    "orama"         -0.61
    " Style"        -0.61
    " magazine"     -0.61
    "pin"           -0.61

    Positive Logits
    " humans"       3.59
    " Humans"       2.81
    "humans"        2.52
    " human"        2.11
    " humankind"    2.10
    " mortals"      2.00
    " mammals"      1.91
    " primates"     1.84
    "human"         1.83
    " humanity"     1.82
    Act Density 0.020%

    No Known Activations