INDEX
    Explanations
    NEGATIVE LOGITS
    "效果"       -0.09
    " chụp"     -0.07
    " teens"    -0.07
    " kl"       -0.07
    " melts"    -0.07
    "5"         -0.07
    "_RECORD"   -0.07
    "14"        -0.06
    " feels"    -0.06
    " roof"     -0.06
    POSITIVE LOGITS
    " authorised"      0.15
    " authorized"      0.14
    " authorization"   0.11
    " unauthorized"    0.10
    " Authorized"      0.09
    "authorized"       0.09
    " authorize"       0.09
    "Authorized"       0.09
    " Authorization"   0.08
    "Unauthorized"     0.08
    Activation Density: 0.008%

    No Known Activations