    Explanations

    negation or expressions of disapproval

    Negative Logits
    yntaxException      -0.44
    exitRule            -0.44
     Marín              -0.42
     itſelf             -0.42
    marathon            -0.42
     betweenstory       -0.40
     Bingham            -0.40
    LookAnd             -0.40
    mphony              -0.40
     asynchronously     -0.38
    Positive Logits
    #+#                 0.48
    IsContent           0.46
     بيها               0.42
    (!__                0.42
     chance             0.41
                        0.41
    setopt              0.40
    ButtonModule        0.40
     дописавши          0.39
    NameInMap           0.39
    Activation Density: 0.009%

    No Known Activations