INDEX
    Explanations

    political or authoritative terms, often relating to power or control

    occurrences of the word "reign" and its variations

    Negative Logits
    Token                 Logit
    "ertodd"              -0.79
    "Quotes"              -0.69
    "hammad"              -0.68
    "FFER"                -0.66
    " Friendly"           -0.62
    " Humanity"           -0.61
    "----------------"    -0.60
    " contrace"           -0.59
    " Spiel"              -0.58
    "WER"                 -0.57

    Positive Logits
    Token                 Logit
    "pin"                 0.85
    "ited"                0.84
    "ieved"               0.78
    "unders"              0.77
    "uin"                 0.75
    "der"                 0.74
    "s"                   0.74
    "esses"               0.74
    "oct"                 0.73
    "iever"               0.71

    Act Density 0.011%

    No Known Activations