INDEX
    Explanations

    references to bars and drinking establishments

    Negative Logits
    ska           -0.16
    erness        -0.15
    _reporting    -0.14
    CALE          -0.14
    arkin         -0.14
    omic          -0.14
    nal           -0.14
    exus          -0.14
    erring        -0.14
    es            -0.14
    Positive Logits
    riere         0.18
    ucene         0.16
    ucu           0.16
    agg           0.16
    ر             0.15
    oque          0.15
    bers          0.15
    resi          0.15
    AGMA          0.15
    ivec          0.15
    Activation Density: 0.037%

    No Known Activations