    NEGATIVE LOGITS
    Token         Logit
    " У"          -0.08
    "ircles"      -0.08
    "angelo"      -0.07
    " superv"     -0.07
    "-workers"    -0.07
    "Ljava"       -0.07
    "ánh"         -0.07
                  -0.07
    " Lawn"       -0.07
    " twelve"     -0.07
    POSITIVE LOGITS
    Token         Logit
    " bit"        0.16
    " Bit"        0.08
    " ^"          0.07
    " BIT"        0.07
    "/bit"        0.07
    "-bit"        0.07
    " sort"       0.07
    "Sorry"       0.06
    " bio"        0.06
    " BMI"        0.06
    Activation Density: 0.018%

    No Known Activations