INDEX
    Explanations

    phrases indicating contrasting viewpoints

    phrases contrasting with societal expectations or norms

    Negative Logits
    Token       Logit
    "kamp"      -0.75
    "velt"      -0.72
    "stru"      -0.69
    "age"       -0.68
    "ixel"      -0.67
    "pires"     -0.65
    "FAQ"       -0.64
    "eur"       -0.62
    "aspers"    -0.62
    "Age"       -0.61
    Positive Logits
    Token             Logit
    " necessarily"    1.50
    " bothering"      1.00
    "epad"            0.95
    "withstanding"    0.95
    "icably"          0.93
    "eworthy"         0.88
    " unlike"         0.84
    " merely"         0.82
    "ifying"          0.80
    "ific"            0.78
    Activation Density: 0.068%

    No Known Activations