INDEX
    Explanations

    links or keywords related to websites or forums

    the presence of end-of-text markers

    Negative Logits
    " destro"        -0.66
    " behav"         -0.64
    "akespe"         -0.63
    "renheit"        -0.63
    " Vaugh"         -0.63
    " disadvant"     -0.62
    "userc"          -0.62
    " nodd"          -0.62
    " toget"         -0.61
    " colle"         -0.61

    Positive Logits
    " Shin"          0.74
    " Advocate"      0.66
    " Rock"          0.65
    " Rise"          0.63
    " Oversight"     0.63
    " Release"       0.62
    " Supporters"    0.60
    " Shutdown"      0.60
    " Theft"         0.60
    " Ahmad"         0.60
    Activation Density: 0.764%

    No Known Activations