INDEX
    Explanations

    instances of expressions related to humor

    Negative Logits
    onne      -0.20
    atar      -0.18
    ipo       -0.18
    ATAR      -0.17
    ottle     -0.16
    uft       -0.15
     hazard   -0.14
    eyi       -0.14
    erver     -0.14
    派        -0.14
    Positive Logits
     cong     0.15
     spots    0.14
    avings    0.14
    WND       0.14
    .Interop  0.13
    lenen     0.13
     |_       0.13
    uning     0.13
    392       0.13
    iming     0.13
    Act Density 0.000%

    No Known Activations