INDEX
    Explanations

    terms related to strong emotional reactions and specific biological processes

    topics related to health, social issues, and environmental concerns

    Negative Logits
     ?)         -0.59
    with        -0.52
    ivating     -0.52
    lance       -0.51
    odied       -0.50
    ?),         -0.50
    encia       -0.47
    ?)          -0.47
    ZA          -0.47
    feat        -0.47

    Positive Logits
    .</         0.81
    .:          0.77
    .�          0.74
    .#          0.73
    .<          0.69
    .''         0.69
    .'          0.69
    .*          0.67
    .","        0.67
    %.          0.63
    Act Density 0.890%

    No Known Activations