INDEX
    Explanations

    phrases expressing preference or opposition

    negations or phrases emphasizing the word "not."

    Negative Logits
    Token             Logit
    "eur"             -0.80
    "velt"            -0.72
    "kamp"            -0.69
    "ction"           -0.68
    "itor"            -0.66
    "ixel"            -0.65
    "ç·"              -0.63
    " Mehran"         -0.63
    "riber"           -0.62
    "lance"           -0.62
    Positive Logits
    Token             Logit
    " necessarily"    1.40
    "icably"          1.20
    "epad"            1.09
    "icable"          0.98
    "etheless"        0.97
    "withstanding"    0.94
    " bothering"      0.92
    "orious"          0.88
    " remotely"       0.85
    "eworthy"         0.81
    Activation Density: 0.091%

    No Known Activations
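
The activation density figure above is commonly read as the fraction of token positions on which a feature fires. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the toy data are assumptions for illustration and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 9 of which are active.
toy = [0.0] * 9991 + [1.3] * 9
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.091% would correspond to roughly 9 active positions per 10,000 tokens sampled.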