    Negative Logits
        Token                 Logit
        " jokes"              -0.80
        " Haskell"            -0.75
        "jokes"               -0.67
        "Nationalité"         -0.61
        " puns"               -0.60
        " Jokes"              -0.60
        "NGC"                 -0.58
        "Jokes"               -0.58
        "reportWebVitals"     -0.57
        "DoubleQuotes"        -0.55
    Positive Logits
        Token                 Logit
        "TagMode"              0.69
        "+#+#"                 0.61
        " CreateTagHelper"     0.56
        "s"                    0.54
        "ned"                  0.52
        "drav"                 0.52
        "rath"                 0.49
        "AVE"                  0.48
        "Autoritní"            0.48
        " respeito"            0.46
    Activation Density: 0.389%

    No Known Activations