INDEX
    Explanations

    words related to permission or prohibition

    phrases related to permission or restrictions on actions

    Negative Logits
    Token            Logit
    "leaf"           -0.71
    "lves"           -0.70
    "fish"           -0.69
    "atl"            -0.68
    "xon"            -0.68
    "oslav"          -0.63
    "tal"            -0.62
    " Generation"    -0.62
    "blow"           -0.60
    "bush"           -0.59
    Positive Logits
    Token            Logit
    "Reviewer"       1.01
    " exemptions"    0.79
    " plur"          0.73
    " deviations"    0.72
    " crawl"         0.71
    "uthor"          0.68
    "usa"            0.67
    "pez"            0.66
    " downtime"      0.66
    "pedia"          0.66
    Activation Density: 0.045%

    No Known Activations