INDEX
    Explanations

    references to conspiracies and conspiracy-related terms

    words starting with "co" or prefixed with "-"

    Negative Logits
    -0.84  " autorytatywna"
    -0.82  "出版年"
    -0.78
    -0.76  " ModelExpression"
    -0.74  "IVEREF"
    -0.72  "كويكب"
    -0.71  "GEBURTSDATUM"
    -0.70
    -0.68  "ftagPool"
    -0.68  "enegger"
    Positive Logits
    0.75  " co"
    0.56  "/*"
    0.51  " authored"
    0.49  "co"
    0.47  "ا"
    0.45  "author"
    0.44  " cosp"
    0.44  " ortak"
    0.43  "sign"
    0.42  " orta"
    Act Density 0.065%

    No Known Activations