INDEX
    Explanations

    references to discussions around social issues and inclusivity

    Negative Logits
     المعيارى
    -0.75
    хьтан
    -0.67
     שוליים
    -0.65
    EndGlobalSection
    -0.63
    SourceChecksum
    -0.63
     idéia
    -0.61
     AssemblyCulture
    -0.61
     jedoch
    -0.59
     resourceCulture
    -0.57
    romyal
    -0.57
    Positive Logits
     fucking
    0.85
     goddamn
    0.81
     fucked
    0.81
     fucks
    0.79
    fucking
    0.78
     fuck
    0.75
     FUCKING
    0.74
     plau
    0.74
    fuck
    0.73
     fuckin
    0.70
    Act Density 0.615%

    No Known Activations