INDEX
    Explanations

    phrases or sentences indicating permission or enabling of actions

    instances of the word "allow" and its variations

    Negative Logits
    Token         Logit
    "borough"     -0.80
    "xon"         -0.72
    "bard"        -0.69
    "bon"         -0.68
    "enegger"     -0.68
    "need"        -0.67
    "kaya"        -0.66
    "nard"        -0.63
    "leaf"        -0.62
    "bons"        -0.61
    Positive Logits
    Token         Logit
    " us"         0.83
    "Reviewer"    0.82
    " me"         0.73
    "ipient"      0.69
    " passers"    0.67
    " them"       0.67
    " him"        0.67
    " rapists"    0.66
    "ANCE"        0.66
    "ances"       0.66
    Activation Density: 0.063%

    No Known Activations