INDEX
    Explanations

    emotional descriptors and adjectives related to various societal issues

    Negative Logits
    Token              Logit
    " own"             -0.17
    " Enhancement"     -0.16
    " Own"             -0.15
    "ka"               -0.14
    "qual"             -0.14
    " another"         -0.14
    " modification"    -0.14
    " a"               -0.14
    "ورا"              -0.14
    "atch"             -0.14
    Positive Logits
    Token              Logit
    " nature"          0.45
    "nature"           0.38
    "ness"             0.27
    " Nature"          0.26
    "Nature"           0.25
    " confines"        0.23
    " aspect"          0.23
    " aspects"         0.22
    "性"               0.22
    " majority"        0.21
    Activation Density: 0.653%

    No Known Activations