INDEX
    Explanations

    references to drinking alcohol and its consequences

    Negative Logits
    (whitespace)    -1.04
    -​               -0.93
    ​—               -0.80
     itſelf         -0.80
     Shakspeare     -0.78
     Etats          -0.78
    tvguidetime     -0.77
     Arki           -0.77
     Hift           -0.75
     pleaſure       -0.75
    Positive Logits
     ...            2.79
     …              2.34
     ..             1.76
     ..."           1.73
     ....           1.70
     ...,           1.62
     ...)           1.61
     ...'           1.54
     ·              1.49
     ...↵           1.47
    Act Density 0.186%

    No Known Activations