INDEX
    Explanations

    words related to cleanliness, specifically related to washing

    references to the concept of washing or washrooms

    Negative Logits
    Token           Logit
    "reme"          -0.80
    "ourke"         -0.78
    "*/("           -0.77
    "iasm"          -0.75
    " Defenders"    -0.69
    "vernment"      -0.68
    "auri"          -0.63
    "izons"         -0.63
    " appre"        -0.62
    "rador"         -0.61
    Positive Logits
    Token           Logit
    " ashore"       1.09
    "cloth"         0.98
    "stakes"        0.95
    " wash"         0.91
    "aways"         0.91
    "houses"        0.89
    " washed"       0.84
    "gate"          0.83
    "robe"          0.83
    "rooms"         0.81
    Activation Density: 0.005%

    No Known Activations