    Explanations

    phrases related to permissions and restrictions

    Negative Logits
    å±        -0.15
    Argb      -0.15
    ibus      -0.14
    itone     -0.14
    rog       -0.14
    ugg       -0.14
     hollow   -0.14
    uer       -0.13
    OLA       -0.13
     labore   -0.13
    Positive Logits
     Pru      0.17
    rum       0.17
    offee     0.16
    mix       0.16
    isher     0.16
    olist     0.15
    ipse      0.15
    abal      0.15
     Knot     0.14
    undance   0.14
    Act Density 0.047%

    No Known Activations