    Explanations

    phrases related to urging or advising others to do or not do something

    negative imperatives or expressions indicating prohibition

    Negative Logits
    Token                 Logit
    catentry              -0.67
    Judge                 -0.66
    Fourth                -0.65
    rouse                 -0.60
    Analysis              -0.60
    utherford             -0.60
    ItemThumbnailImage    -0.59
    Higher                -0.58
    ullah                 -0.58
    METHOD                -0.57
    Positive Logits
    Token                 Logit
    ï¸ı                   0.73
     kidding              0.72
    =""                   0.69
     âĢº                  0.64
     ðŁij                 0.64
     (@                   0.63
    ymes                  0.63
    ĸļ                    0.62
     ðŁ                   0.60
     FANTASY              0.60
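
Several of the positive-logit tokens look garbled because they appear to be shown in a byte-level BPE display alphabet, where every raw byte is drawn as a printable stand-in character. The sketch below assumes the GPT-2 style byte-to-character table (an assumption, since the dump does not name the tokenizer) and inverts it to recover the underlying bytes. The helper names `byte_decoder`, `token_to_bytes`, and `describe` are hypothetical and chosen only for this illustration.

```python
def byte_decoder():
    """Build the inverse of the GPT-2 style byte-to-character table.

    Assumption: the dashboard renders tokens with this table; the dump
    itself does not say which tokenizer produced them.
    """
    # Bytes that are already printable are displayed as themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("\u00a1"), ord("\u00ac") + 1))
        + list(range(ord("\u00ae"), ord("\u00ff") + 1))
    )
    mapping = {chr(b): b for b in printable}
    # Every other byte is displayed as the next unused code point from 256 up.
    shift = 0
    for b in range(256):
        if b not in printable:
            mapping[chr(256 + shift)] = b
            shift += 1
    # Assumption: the dump shows a leading space literally, not as its stand-in.
    mapping[" "] = 0x20
    return mapping


DECODER = byte_decoder()


def token_to_bytes(token: str) -> bytes:
    """Return the raw bytes a displayed token stands for."""
    return bytes(DECODER[ch] for ch in token)


def describe(token: str) -> str:
    """Decode a displayed token, or report its bytes if it is a UTF-8 fragment."""
    raw = token_to_bytes(token)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return f"<partial UTF-8: {raw.hex(' ')}>"
```

Under that assumption, the top positive token decodes to the bytes `EF B8 8F`, which is U+FE0F (the emoji variation selector), and the fourth decodes to a space followed by U+203A. Tokens that stop partway through a multi-byte character cannot be decoded on their own, so `describe` reports their bytes instead.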
    Act Density 0.645%

    No Known Activations