INDEX
    Explanations

    praise or positive reactions in casual conversations

    expressions that convey a sense of comparison or similarity

    Negative Logits
    Token          Logit
    "ourn"         -0.83
    "irements"     -0.76
    "Published"    -0.75
    "Returns"      -0.71
    "ependence"    -0.70
    "icators"      -0.69
    "ourse"        -0.68
    "isition"      -0.67
    "acia"         -0.67
    "ribution"     -0.66
    Positive Logits
    Token          Logit
    "liest"        1.14
    "lihood"       1.05
    "lier"         0.93
    " wow"         0.92
    " oh"          0.81
    " crazy"       0.81
    "hhh"          0.77
    "ooo"          0.75
    " idiots"      0.74
    " crap"        0.74
    Activation Density: 0.059%

    No Known Activations