INDEX
    Explanations

    mentions of conspiracy and political manipulation

    "of" followed by "the"

    Negative Logits
    sidemargin        -0.84
    SuppressMessage   -0.76
    المكان            -0.71
    Gizmos            -0.67
    oneofs            -0.65
     ligiloj          -0.65
    RTLR              -0.65
    DotNetBar         -0.64
    脚注の使い方        -0.64
     dwie             -0.64
    Positive Logits
     what             0.73
     us               0.57
     it               0.56
    ValueStyle        0.53
     these            0.52
    enumii            0.49
     this             0.49
    Ditto             0.47
     them             0.47
    unknownFields     0.47
    Act Density: 0.128%

    No Known Activations