INDEX
    Explanations

    phrases indicating high levels of concern or urgency

    Negative Logits
    Token        Logit
    "нич"        -0.15
    " itself"    -0.15
    " cl"        -0.15
    "erties"     -0.14
    "ensible"    -0.14
    "しの"       -0.14
    "ällt"       -0.14
    "wer"        -0.14
    "ivery"      -0.14
    "olet"       -0.14
    Positive Logits
    Token        Logit
    " Breed"     0.16
    "urator"     0.16
    " Dodd"      0.16
    " Rip"       0.15
    "ея"         0.15
    "uda"        0.14
    "odata"      0.14
    "kad"        0.14
    "ře"         0.14
    "ason"       0.14
    Activation Density: 0.046%

    No Known Activations