    Explanations

    phrases related to control or authority

    phrases emphasizing the concept of allowing or permitting actions or behaviors

    Negative Logits
    Token          Logit
    "oppable"      -0.83
    " Languages"   -0.72
    " holiest"     -0.69
    "lihood"       -0.67
    "atana"        -0.67
    " millenn"     -0.66
    "agher"        -0.66
    "querque"      -0.63
    " cumbers"     -0.63
    "edly"         -0.63
    Positive Logits
    Token          Logit
    "tered"        1.12
    "tering"       0.92
    " slip"        0.87
    "icia"         0.86
    "enne"         0.75
    " loose"       0.75
    " us"          0.73
    "itia"         0.69
    " go"          0.67
    " me"          0.67
    Activation Density: 0.028%

    No Known Activations