    Explanations

    phrases related to different forms of concepts

    phrases indicating different kinds of forms or categories

    Negative Logits
    "urers"          -0.87
    " Zup"           -0.79
    "ween"           -0.73
    "doms"           -0.70
    "iets"           -0.70
    " Cosponsors"    -0.70
    " teasp"         -0.70
    "ostics"         -0.70
    "nets"           -0.67
    "omers"          -0.67
    Positive Logits
    " harassment"       0.84
    " accommodation"    0.81
    "thood"             0.81
    " activism"         0.79
    " inspiration"      0.78
    " discrimination"   0.76
    " insanity"         0.74
    " dementia"         0.74
    " taxation"         0.71
    " humor"            0.71
    Activation Density: 0.060%

    No Known Activations