INDEX
    Explanations

    words related to moral values and principles

    plural nouns or adjectives and words related to groups or categories

    Negative Logits
    PET       -0.79
    amba      -0.70
    Tex       -0.66
    INO       -0.66
    UL        -0.63
    UCK       -0.62
    window    -0.61
    LECT      -0.60
    ATED      -0.59
    Sharp     -0.59
    Positive Logits
    gemony    0.90
    ashtra    0.80
    rahim     0.78
    apego     0.78
    ndra      0.73
    brids     0.73
    ths       0.72
    ilege     0.71
    zsche     0.70
    andum     0.70
    Activation Density: 0.221%

    No Known Activations