    Explanations

    phrases related to consequences and regulations

    Negative Logits
    'ibal'       -0.17
    'itor'       -0.15
    'ornings'    -0.15
    'illac'      -0.14
    'rouch'      -0.14
    ' Bey'       -0.14
    'ltk'        -0.14
    ' Fet'       -0.14
    'abit'       -0.14
    '.za'        -0.14
    Positive Logits
    ' itself'    0.16
    'gere'       0.14
    ' meaning'   0.14
    'appe'       0.14
    '.Criteria'  0.14
    ' υπό'       0.13
    'otas'       0.13
    '_nsec'      0.13
    'eč'         0.13
    ' "..'       0.13
    Activation Density: 0.204%

    No Known Activations