INDEX
    Explanations

    phrases related to taking risks or danger to life

    Negative Logits
    Token         Logit
    iaux          -0.19
    ìŀĶ           -0.15
    lesc          -0.15
    .utilities    -0.15
    ulong         -0.15
    perature      -0.15
    ÙĨØ´          -0.14
    orate         -0.14
    aliz          -0.14
    uros          -0.14

    Positive Logits
    Token         Logit
     mot          0.17
     bic          0.16
    kola          0.15
     Fet          0.15
     Risk         0.14
    ummer         0.14
    ToDevice      0.14
    lessly        0.14
     risk         0.14
    ily           0.14

    Activation Density: 0.019%

    No Known Activations