INDEX
    Explanations

    explicit mentions of rules or restrictions being imposed on specific behaviors, actions, or groups

    phrases related to permissions and prohibitions

    Negative Logits
     Soldier      -0.70
    xon           -0.67
    lves          -0.65
    lust          -0.62
    posure        -0.62
    borough       -0.61
    athan         -0.61
     Generation   -0.61
    center        -0.60
    sis           -0.59
    Positive Logits
    Reviewer      1.09
    uthor         0.87
     exemptions   0.79
    ommod         0.76
    ľ             0.74
     allowed      0.72
     disclaim     0.70
    ptin          0.70
     permitted    0.70
    ravel         0.69
    Activation Density: 0.038%

    No Known Activations