    Explanations

    phrases related to permission or freedom to act as one desires

    expressions of desire or permission

    Negative Logits
    "ricks"      -0.73
    "ynski"      -0.72
    " Berk"      -0.67
    "riot"       -0.64
    "errors"     -0.62
    "bug"        -0.60
    "riots"      -0.58
    "utenant"    -0.58
    " enthusi"   -0.58
    "hero"       -0.57
    Positive Logits
    " to"        0.71
    "urities"    0.69
    "GB"         0.65
    "edIn"       0.63
    " mate"      0.63
    " quotas"    0.61
    "awaru"      0.61
    "htar"       0.60
    "uld"        0.60
    "atra"       0.58
    Activation Density: 0.078%

    No Known Activations