INDEX
    Explanations

    phrases indicating urgency or a critical situation

    Negative Logits
    éļª           -0.15
    ì»            -0.15
    224           -0.15
    imple         -0.15
    peek          -0.14
    engkap        -0.14
    ãĥĥãĥĪ        -0.14
    meler         -0.14
    нÑĮ           -0.13
    TextWriter    -0.13
    Positive Logits
     DEST         0.28
     absolutely   0.28
     obl          0.27
     sm           0.25
     smoked       0.24
     torch        0.24
     dec          0.24
     tear         0.23
     destroy      0.23
    destroy       0.23
    Activation Density: 0.333%

    No Known Activations
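
    Several entries in the negative-logit list (for example ãĥĥãĥĪ) look like
    byte-level BPE token strings rather than readable text. The sketch below
    shows one way to map such a string back to UTF-8. It assumes the tokenizer
    uses the GPT-2-style byte-to-unicode table, which this page does not
    confirm; the helper names bytes_to_unicode and decode_token are
    illustrative only.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each byte (0-255) to a printable character.

    Assumption: the dashboard's tokenizer uses this same table.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes (control characters, space, etc.) are shifted to 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Convert a displayed token string back to text.

    Incomplete UTF-8 sequences (tokens that hold only part of a character)
    are rendered with the replacement character instead of raising.
    Characters outside the table raise KeyError.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥĥãĥĪ"))  # prints ット
```

    Plain ASCII tokens such as peek pass through unchanged. A token that holds
    only part of a multi-byte character (ì» appears to be one) decodes to
    replacement characters rather than readable text.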