    Explanations

    concepts related to ethical dilemmas and the value of life

    Negative Logits
    Token              Logit
    " Lent"            -0.16
    "achat"            -0.15
    " iT"              -0.15
    " Rating"          -0.14
    "eming"            -0.14
    ".ribbon"          -0.14
    "visualization"    -0.14
    "ल"                -0.13
    " Elsa"            -0.13
    " questioning"     -0.13
    Positive Logits
    Token              Logit
    " Raw"             0.29
    "Raw"              0.26
    "_raw"             0.19
    " norm"            0.18
    " Thick"           0.18
    "norm"             0.17
    " morally"         0.17
    " duties"          0.17
    ".Raw"             0.17
    "Minimal"          0.17
    Activation Density: 0.066%

    No Known Activations