INDEX
    Explanations

    mentions of actions involving the nose

    occurrences of the term "sn" likely referring to derogatory or negative slang terms

    Negative Logits
    Token           Logit
    "heid"          -0.84
    "xual"          -0.72
    "EMENT"         -0.72
    "WAYS"          -0.65
    "shire"         -0.64
    " PowerPoint"   -0.61
    "minus"         -0.61
    " lack"         -0.61
    "mine"          -0.61
    " Templar"      -0.58
    Positive Logits
    Token           Logit
    "obb"           1.17
    "appers"        1.15
    "agging"        1.15
    "atching"       1.15
    "agged"         1.14
    "apper"         1.13
    "atches"        1.12
    "appy"          1.10
    "oot"           1.09
    "ips"           1.09
    Activation Density: 0.012%

    No Known Activations