    Explanations

    words that indicate a strong impact or significant change

    Negative Logits
    Token         Logit
    "aspers"      -0.15
    "/GPL"        -0.15
    " Baxter"     -0.15
    "enger"       -0.15
    "ãĥĩãĥ«"      -0.14
    " absol"      -0.14
    "bach"        -0.14
    "pps"         -0.14
    "bos"         -0.14
    ".tp"         -0.14
    Positive Logits
    Token         Logit
    "æıIJé«ĺ"       0.18
    "æĿ"           0.18
    " different"   0.18
    " improve"     0.17
    " improves"    0.17
    "atta"         0.16
    "oup"          0.16
    " improved"    0.16
    " Improve"     0.16
    "oupon"        0.16
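
Some tokens in the logit lists (for example the first two positive entries and the fifth negative entry) are not corrupted text. They look like the display form of a byte-level BPE vocabulary, in which every raw byte is shown as one printable character. The sketch below assumes the tokens come from a GPT-2-style tokenizer, which the page itself does not state, and rebuilds that byte-to-character table to convert in both directions. The function names are illustrative, not part of any library. Under this assumption, the first positive token decodes to 提高 (Chinese for "improve" or "raise"), which fits the English " improve" entries next to it.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> display character table in the style of GPT-2 byte-level BPE.

    Assumption: the dashboard's tokenizer uses this same table.
    """
    # Bytes that are already printable are shown as themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to the next free code point from U+0100.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def encode_token(text: str) -> str:
    """Turn ordinary text into its byte-level display form."""
    return "".join(BYTE_TO_CHAR[b] for b in text.encode("utf-8"))


def decode_token(display: str) -> str:
    """Turn a byte-level display string back into ordinary text."""
    raw = bytearray()
    for ch in display:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table (such as a literal leading space
            # shown by the dashboard) are passed through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for text in ("提高", "デル", " improve"):
        shown = encode_token(text)
        print(repr(shown), "->", repr(decode_token(shown)))
```

A token that holds only part of a multi-byte character, as the second positive entry appears to, cannot be decoded on its own. `decode_token` returns a replacement character for it rather than raising an error.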
    Activation Density: 0.047%

    No Known Activations