INDEX
    Explanations

    phrases related to spreading information, specifically encouraging others to share and promote content

    Negative Logits
    deen              -0.80
    */(               -0.73
     Zup              -0.72
    clamation         -0.69
    herty             -0.66
    --+               -0.65
    tarians           -0.64
    venge             -0.64
     Starship         -0.62
    udeau             -0.62
    Positive Logits
    sheets            1.84
    sheet             1.38
     misinformation   0.93
    shirt             0.88
     disinformation   0.82
     awareness        0.82
     geographically   0.80
     across           0.80
     spreads          0.80
    pread             0.78
    Activation Density: 0.036%

    No Known Activations