INDEX
    Explanations
    Negative Logits
    Token          Logit
    "leton"        -0.72
    " jax"         -0.71
                   -0.68
    "เฉ"           -0.67
    " NotFound"    -0.66
    "スプレ"        -0.65
    "auru"         -0.65
    "Rainbow"      -0.63
    "ại"           -0.63
    "Dx"           -0.63

    Positive Logits
    Token          Logit
    " lower"       3.59
    " lowercase"   3.55
    " Lower"       3.09
    "lowercase"    3.00
    "Lower"        2.98
    "lower"        2.98
    "LOWER"        2.44
    " lowers"      2.31
    " LOWER"       2.30
    "LowerCase"    2.23
    Activation Density: 0.110%

    No Known Activations