    Explanations

    phrases indicating significant improvements or advancements

    Negative Logits
    Token             Logit
    "essee"           -0.88
    "Interstitial"    -0.76
    "anyahu"          -0.71
    "entary"          -0.66
    "urally"          -0.65
    " Beware"         -0.62
    "zza"             -0.60
    " Sins"           -0.60
    "XM"              -0.60
    "gel"             -0.60
    Positive Logits
    Token             Logit
    "frog"            1.23
    " leaps"          0.98
    "olate"           0.85
    " forward"        0.80
    "hemer"           0.79
    "roads"           0.78
    "Forward"         0.76
    "ering"           0.75
    "olicy"           0.74
    "arts"            0.74
    Activation Density: 0.017%

    No Known Activations