    Explanations

    contexts where a significant change or increase occurs

    phrases that indicate significant changes or increases

    Negative Logits
    tein
    -0.89
    rity
    -0.76
    sburgh
    -0.76
    icip
    -0.71
    busters
    -0.69
    nan
    -0.68
    nar
    -0.67
    imir
    -0.67
    orno
    -0.67
    "}],"
    -0.65
    Positive Logits
     effected
    0.76
     altering
    0.74
    ャ
    0.72
     alter
    0.72
     proport
    0.69
     サーティワン
    0.68
     changed
    0.68
    owered
    0.68
     impacting
    0.68
     alters
    0.68
    Act Density 0.015%

    No Known Activations