INDEX
    Explanations

    words related to rules, regulations, and guidelines

    concepts related to conformity and alignment with certain ideologies or perspectives

    Negative Logits
    "pard"       -0.70
    "aniel"      -0.64
    "town"       -0.63
    "mare"       -0.62
    "istan"      -0.62
    "patrick"    -0.62
    " reluct"    -0.61
    "antics"     -0.60
    "ppa"        -0.59
    "asking"     -0.59
    Positive Logits
    " neither"         0.83
    " directly"        0.75
    " measurable"      0.74
    " specific"        0.72
    " specifically"    0.72
    " overlap"         0.72
    " GMOs"            0.70
    " antit"           0.70
    " solely"          0.69
    " nothing"         0.69
    Activation Density: 0.372%

    No Known Activations