INDEX
    Negative Logits
    " magazine"     -0.09
    " matérias"     -0.08
    " Polo"         -0.08
    " underside"    -0.08
    " Corinth"      -0.08
    " Alliance"     -0.08
    " chão"         -0.08
    " bevestigd"    -0.08
    " Roof"         -0.07
    " roof"         -0.07
    Positive Logits
    " privacy"      0.14
    "Privacy"       0.12
    " privacidad"   0.11
    " Privacy"      0.11
    "privacy"       0.10
    " anonym"       0.10
    " 개인정보"      0.10
    " safeguarding" 0.09
    " anonim"       0.09
    "_sensitive"    0.09
    Activation Density: 0.005%

    No Known Activations