INDEX
    Explanations

    mentions of the word "permission"

    Negative Logits
    Token           Logit
    "ulative"       -0.73
    "vae"           -0.70
    "oche"          -0.68
    "ulz"           -0.67
    "arer"          -0.66
    "itched"        -0.64
    "famous"        -0.64
    "ilde"          -0.61
    " Revel"        -0.61
    "tal"           -0.61
    Positive Logits
    Token           Logit
    " permission"   1.14
    " permissions"  1.02
    " waivers"      0.85
    " granted"      0.83
    " slips"        0.82
    "Reviewer"      0.80
    " clearance"    0.80
    " authorizing"  0.79
    "eous"          0.73
    " confir"       0.73
    Act Density 0.024%

    No Known Activations