INDEX
    Explanations

    negations or words indicating the absence of something

    Negative Logits
    Token           Logit
    " же"           -0.16
    "گرد"           -0.16
    "berger"        -0.16
    "ingu"          -0.15
    " nor"          -0.15
    "hores"         -0.15
    "hoe"           -0.15
    "ickerView"     -0.15
    "eenth"         -0.14
    "/***/"         -0.14
    Positive Logits
    Token           Logit
    "ori"            0.25
    " surprisingly"  0.21
    " only"          0.21
    " everyone"      0.21
    " knowing"       0.20
    " everything"    0.20
    " sure"          0.19
    " having"        0.19
    "tingham"        0.19
    " least"         0.18
    Activation Density: 0.065%

    No Known Activations