    Explanations

    phrases indicating the act of revealing information or oneself

    instances of revealing information or disclosures related to personal or confidential matters

    Negative Logits
    "Reviewed"       -0.74
    "oslav"          -0.74
    " recognizes"    -0.64
    " emulate"       -0.64
    "hesda"          -0.62
    "chairs"         -0.62
    " realize"       -0.60
    " upkeep"        -0.60
    "rollers"        -0.59
    "eson"           -0.59
    Positive Logits
    " secrets"            0.97
    " clues"              0.85
    " mysteries"          0.80
    " trove"              0.78
    " WikiLeaks"          0.75
    " vulnerabilities"    0.73
    " whereabouts"        0.72
    "İĭ"                  0.72
    " incrim"             0.72
    " truths"             0.70
    Act Density 0.252%

    No Known Activations