INDEX
    Explanations

    political and ethical terms and concepts

    Negative Logits
    verbs         -0.83
    ku            -0.66
    door          -0.63
    paren         -0.62
    ramid         -0.60
    ammy          -0.60
    gap           -0.59
    packed        -0.59
    activation    -0.59
    cell          -0.59
    Positive Logits
    ments         0.89
    entimes       0.89
    hower         0.79
    ocument       0.77
    é¾įåĸļ士      0.76
    tainment      0.72
    ĸļ            0.71
    eenth         0.70
    ufact         0.70
    mares         0.70
    Activation Density: 3.271%

    No Known Activations