    Explanations

    words related to alcoholic beverages and parties

    Negative Logits
    iversal     -0.86
    aido        -0.82
    DAY         -0.81
    nesota      -0.80
    orate       -0.78
    İĭ          -0.78
    emade       -0.77
    orable      -0.77
    omo         -0.77
     srf        -0.76
    Positive Logits
    aux         1.14
    lli         1.08
    llo         0.95
    lla         0.93
    ux          0.80
    bourg       0.79
    urs         0.78
    agne        0.78
     du         0.78
     Hollande   0.76
    Act Density 0.009%

    No Known Activations