INDEX
    Explanations

    phrases indicating manipulation, subversion, and the triggering of unrest or dissent

    Negative Logits
    Token               Logit
    Tikang              -0.76
     ModelExpression    -0.62
    httphttps           -0.49
    HomeAsUpEnabled     -0.48
     AttributeSet       -0.48
    WriteTagHelper      -0.47
    انيف                -0.46
    /**                 -0.45
     Exacts             -0.45
    KURZBESCHREIBUNG    -0.45
    Positive Logits
    Token               Logit
     estratégico        0.49
    ThemeOverlay        0.46
     tactic             0.45
     psicológica        0.45
    わざ                0.41
     ligiloj            0.41
    あえて              0.40
    PAD                 0.39
     psicológico        0.39
     purposely          0.38
    Activation Density: 0.829%

    No Known Activations