INDEX
    Explanations

    phrases emphasizing key points or arguments

    Negative Logits
    "orie"      -0.15
    "ippy"      -0.15
    "ischer"    -0.15
    " Days"     -0.14
    "ipl"       -0.14
    " soon"     -0.14
    " Soon"     -0.14
    "esco"      -0.13
    "ycin"      -0.13
    "azed"      -0.13
    Positive Logits
    " precisely"    0.17
    "ubern"         0.16
    "hev"           0.16
    "utow"          0.16
    "dez"           0.15
    "eyn"           0.15
    " именно"       0.15
    " itself"       0.15
    "ceb"           0.15
    "eden"          0.14
    Activation Density: 0.156%

    No Known Activations