INDEX
    Explanations

    negative phrases or words that indicate disapproval or caution

    Negative Logits
    Token           Logit
    " Vind"         -0.61
    "tagext"        -0.56
    "ViewFeatures"  -0.55
    " proposte"     -0.54
    " Varela"       -0.53
    "rp"            -0.51
    " VIAF"         -0.51
    "Accessors"     -0.51
    "Leif"          -0.50
    "ìm"            -0.49
    Positive Logits
    Token           Logit
    "should"        1.20
    " Should"       1.19
    " should"       1.13
    " SHOULD"       1.11
    "Should"        1.10
    "hould"         1.04
    " shouldn"      0.97
    " shouldBe"     0.97
    " shouldnt"     0.95
    " Shouldn"      0.93
    Activation Density: 0.082%

    No Known Activations