INDEX
    Explanations

    words related to cleanliness, organization, or positive attributes

    phrases related to cleanliness and orderliness

    Negative Logits
    "CVE"           -0.71
    "asions"        -0.68
    "Downloadha"    -0.68
    "oan"           -0.66
    "GA"            -0.66
    "ioxide"        -0.66
    "7601"          -0.66
    "ativity"       -0.66
    " Defenders"    -0.65
    "lection"       -0.64
    Positive Logits
    " neat"         1.06
    "ness"          0.99
    "nesses"        0.97
    " tidy"         0.91
    " tid"          0.85
    "icles"         0.82
    "ilde"          0.77
    "liness"        0.75
    " contra"       0.72
    " little"       0.71
    Activation Density: 0.008%

    No Known Activations