INDEX
    Explanations

    adjectives related to the evaluation of concepts or things

    phrases related to opinions or assessments of societal and cultural topics

    Negative Logits
    Token            Logit
    "igent"          -0.77
    " conclud"       -0.74
    "steps"          -0.70
    "ktop"           -0.70
    "Materials"      -0.69
    " underscores"   -0.69
    " reiterate"     -0.68
    "cellaneous"     -0.68
    "edIn"           -0.68
    "»Ĵ"             -0.67
    Positive Logits
    Token            Logit
    " ruining"       1.16
    " gonna"         1.10
    " racist"        1.09
    " sexist"        1.04
    " extinct"       1.04
    " crap"          1.04
    "nt"             1.02
    " unbeat"        1.02
    " BAD"           1.00
    " horrible"      1.00
    Activation Density: 0.406%

    No Known Activations