    Explanations

    instances of the word "only."

    Negative Logits
    Token       Logit
    "#End"      -0.19
    "tright"    -0.16
    "etter"     -0.16
    "usz"       -0.15
    "iling"     -0.15
    "UILD"      -0.15
    "ingly"     -0.15
    " rac"      -0.15
    "esy"       -0.15
    "ishly"     -0.14

    Positive Logits
    Token       Logit
    "endor"      0.16
    "/or"        0.16
    "eparator"   0.15
    " th"        0.14
    "osh"        0.14
    "ewood"      0.14
    "yaw"        0.14
    "apur"       0.14
    "isd"        0.13
    "horia"      0.13

    Activation Density: 0.012%

    No Known Activations