    Explanations

    references to the word "dirty" and its variations in various contexts

    Negative Logits
    Token        Logit
    "alo"        -0.17
    "گاÙĩ"       -0.16
    "een"        -0.16
    "uro"        -0.15
    "ote"        -0.15
    "Interop"    -0.15
    "hod"        -0.15
    "JI"         -0.15
    "nett"       -0.15
    "ONT"        -0.15
    Positive Logits
    Token        Logit
    " dirty"     0.22
    " Dirty"     0.21
    " little"    0.20
    " laundry"   0.20
    "dirty"      0.19
    "Dirty"      0.18
    " tricks"    0.17
    "ymb"        0.17
    "-minded"    0.17
    " deeds"     0.17
    Activation Density: 0.009%

    No Known Activations