INDEX
    Explanations

    phrases related to societal or political power dynamics

    references to entities or groups identified by the suffix "s"

    Negative Logits
    bear          -0.72
    esp           -0.70
    Tap           -0.69
    tags          -0.68
    CLAIM         -0.67
    š             -0.66
    fish          -0.66
    tk            -0.65
     Rothschild   -0.64
    horse         -0.63
    Positive Logits
     own          0.96
    selves        0.95
     plight       0.79
    terday        0.78
    pecially      0.78
    etheless      0.76
    senal         0.76
     successor    0.75
     whereabouts  0.73
    olution       0.71
    Act Density 0.171%

    No Known Activations