INDEX
    Explanations

    mentions of people and their perspectives or actions in various contexts

    "Many [group of people]"

    Negative Logits
    Token            Logit
    " anything"      -0.79
    " anyone"        -0.73
    "anyone"         -0.70
    "Anything"       -0.70
    " any"           -0.70
    "anything"       -0.69
    " Anything"      -0.68
    " mostly"        -0.68
    " Anyone"        -0.67
    " Cualquier"     -0.67
    Positive Logits
    Token            Logit
    " including"     0.76
    " either"        0.73
    "including"      0.72
    " Either"        0.70
    "INCLUDING"      0.68
    "either"         0.67
    "Either"         0.64
    " Including"     0.64
    " choose"        0.63
    "甚至是"          0.62
    Activation Density: 0.344%

    No Known Activations