    Negative Logits
        " Portman"     -0.53
        " Bie"         -0.49
        " oleju"       -0.47
        "#"            -0.46
        "beto"         -0.45
        " Woodruff"    -0.45
        "ekš"          -0.44
        " Goodyear"    -0.44
        " Bulb"        -0.43
        "globo"        -0.43
    Positive Logits
        " cat"         1.99
        " feline"      1.92
        " cats"        1.91
        " Cat"         1.82
        " Cats"        1.80
        "Cat"          1.76
        " CAT"         1.72
        "Cats"         1.63
        "cat"          1.61
        " kitty"       1.61
    Activation Density: 0.202%

    No Known Activations