INDEX
    Explanations

    phrases where individuals introduce themselves by stating their name

    phrases that introduce a name

    Negative Logits
    aunders              -0.73
     guiActiveUnfocused  -0.70
    yrinth               -0.69
    nuts                 -0.66
    Js                   -0.65
    iership              -0.64
    erker                -0.63
     receptive           -0.63
    istas                -0.61
    EMS                  -0.61
    Positive Logits
    plates               1.35
    plate                1.30
     tags                0.91
     tag                 0.91
    ames                 0.88
    checked              0.87
     redacted            0.86
    checks               0.86
    paces                0.85
    akes                 0.83
    Act Density 0.037%

    No Known Activations