INDEX
    Explanations

    phrases related to comparisons or choices between different options

    repeated references to the word "the" and its context within phrases

    Negative Logits
    "maxwell"          -0.74
    "assetsadobe"      -0.72
    "owicz"            -0.71
    "lance"            -0.71
    "govtrack"         -0.69
    "ledge"            -0.67
    "lie"              -0.64
    "Allows"           -0.64
    "ovation"          -0.64
    "REC"              -0.63
    Positive Logits
    " aforementioned"   1.00
    " sexes"            0.85
    " facets"           0.85
    " foregoing"        0.84
    "ses"               0.81
    " factions"         0.80
    " scenarios"        0.80
    " options"          0.80
    " evils"            0.79
    " extremes"         0.79
    Act Density 0.146%

    No Known Activations