INDEX
    Explanations

    instances of humor or funny content

    Negative Logits
    " Dessa"              -0.80
    " myſelf"             -0.78
    " Paragu"             -0.76
    " Majefty"            -0.75
    " Chriftian"          -0.75
    " Pisa"               -0.75
    " Atwood"             -0.74
    "ASTIC"               -0.73
    " ſtate"              -0.72
    "CodedInputStream"    -0.71

    Positive Logits
    "er"                   1.20
    "erà"                  0.92
    " FileName"            0.86
    "FileName"             0.84
    "GrantedAuthority"     0.81
    " Muir"                0.77
    " Brenner"             0.72
    "linec"                0.72
    "scaron"               0.71
    "ρίου"                 0.71

    Act Density 0.173%

    No Known Activations