    Explanations

    words related to alcohol consumption and its consequences, specifically focusing on instances of drunkenness

    references to intoxication, particularly related to being drunk

    Negative Logits
    Downloadha      -0.88
     JPM            -0.78
     Flavoring      -0.75
    akeru           -0.72
    ILA             -0.71
    DonaldTrump     -0.70
    isite           -0.70
    adr             -0.68
    metics          -0.67
    ocol            -0.66
    Positive Logits
     drunk          1.00
    ards            0.95
    bott            0.93
     drinking       0.88
     underage       0.84
    cohol           0.83
    ness            0.82
     manslaughter   0.80
     alcohol        0.79
     binge          0.79
    Act Density 0.029%

    No Known Activations