    Negative Logits
    Token                 Logit
    "########."           -0.65
    ":✨"                 -0.60
    (blank in source)     -0.54
    " allowances"         -0.53
    "LOST"                -0.52
    " toppers"            -0.51
    "Geplaatst"           -0.51
    " Introducing"        -0.50
    " consultations"      -0.49
    "consultation"        -0.49
    Positive Logits
    Token                 Logit
    " hair"               1.20
    "hair"                1.05
    " HAIR"               1.03
    " Hair"               1.02
    "Hair"                0.97
    "HAIR"                0.82
    " hairs"              0.80
    " rambut"             0.70
    " cheveux"            0.69
    " compressed"         0.69
    Activation Density: 0.203%

    No Known Activations