    Explanations

    phrases related to positive attributes or qualities like well-regarded, well-deserved, and well-resourced

    words related to regulation and governance

    Negative Logits
    Token         Logit
    "knife"       -0.68
    "mania"       -0.63
    "agents"      -0.62
    "pora"        -0.61
    " Decoder"    -0.60
    "magic"       -0.60
    "Discussion"  -0.60
    "wolves"      -0.60
    " Jackets"    -0.60
    "terms"       -0.58
    Positive Logits
    Token         Logit
    "ented"       1.09
    "ited"        1.07
    "ivated"      1.06
    "oured"       1.02
    "ated"        1.02
    "enged"       1.02
    "arded"       1.01
    "ested"       1.00
    "ioned"       1.00
    "ured"        0.98
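
The positive-logit tokens listed above all look like word endings, which fits the first explanation's examples (for instance "well-reg" + "arded"). A minimal sketch of that check follows, assuming the ten tokens are copied correctly from the list; the names `positive_logits` and `share_suffix` are hypothetical and not part of the dashboard.

```python
# Top positive-logit tokens and values, copied from the list above.
positive_logits = {
    "ented": 1.09, "ited": 1.07, "ivated": 1.06, "oured": 1.02, "ated": 1.02,
    "enged": 1.02, "arded": 1.01, "ested": 1.00, "ioned": 1.00, "ured": 0.98,
}


def share_suffix(tokens, suffix):
    """Return True if every token ends with the given suffix."""
    return all(tok.endswith(suffix) for tok in tokens)


# Hypothetical helper usage: do all promoted tokens end in "ed"?
all_ed = share_suffix(positive_logits, "ed")
print(all_ed)
```

This only inspects the ten tokens shown here; it says nothing about tokens further down the logit ranking.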
    Activation Density: 0.172%

    No Known Activations