INDEX
    Explanations

    instances of scandal or misconduct involving public figures

    Negative Logits
     ÑĦеÑĢ      -0.15
    laÄį        -0.15
    ppo         -0.14
    ıc          -0.14
    ardware     -0.14
    çµIJå©ļ      -0.14
    ãĥ¬ãĥ¼      -0.14
    ToMany      -0.14
    ëĥ          -0.14
    Hardware    -0.13
    Positive Logits
     escort     0.40
     escorts    0.39
     Escort     0.35
    escort      0.35
    Escort      0.34
     Escorts    0.30
     broth      0.27
    Esc         0.27
     call       0.26
     clients    0.25
    Activation Density: 0.016%

    No Known Activations