INDEX
    Explanations

    disinformation and misinformation

    Negative Logits
    北海道  0.54
    сного  0.53
    家族  0.51
     homozyg  0.51
    льного  0.51
     recorrido  0.50
    ገል  0.50
    त्ति  0.48
    0.47
     pregnancies  0.47
    Positive Logits
     disinformation  0.73
     you  0.64
    filtering  0.64
     Propaganda  0.61
     misinformation  0.61
    we  0.59
     propaganda  0.59
     your  0.58
     countermeasures  0.58
     Markt  0.57
    Activation Density: 0.093%

    No Known Activations