INDEX
    Explanations

    statements of criticism or commentary towards public figures or social issues

    Negative Logits
     unspeak     -1.84
     intersper   -1.83
     increa      -1.83
     snoopy      -1.80
     fta         -1.79
     thut        -1.78
     ftu         -1.78
     tolerably   -1.75
     gaily       -1.74
     apprehen    -1.72
    Positive Logits
    FlatAppearance     0.86
    IntoConstraints    0.72
    NOPQRST            0.71
    DataPropertyName   0.70
                       0.67
    Producción         0.67
    Opere              0.67
    FlatStyle          0.67
    dymyr              0.66
    imageio            0.65
    Activation Density: 0.030%

    No Known Activations