    Explanations

    references to specific names, particularly those of people involved in the entertainment industry

    Negative Logits
     Berm
    -0.16
    ків
    -0.15
     medal
    -0.15
     demon
    -0.14
    lex
    -0.14
    antan
    -0.14
     firing
    -0.14
    fm
    -0.14
    _IOS
    -0.13
    ainted
    -0.13
    Positive Logits
     Stephen
    0.16
    LAG
    0.16
     Steve
    0.15
    ンピ
    0.15
    νομ
    0.15
    INDOW
    0.15
    ổi
    0.15
     Gazette
    0.15
    ooke
    0.14
    -sensitive
    0.14
    Activation Density: 0.022%

    No Known Activations