INDEX
    Explanations

    phrases related to accountability and the consequences of actions

    Negative Logits
    "ite"         -0.15
    " Inject"     -0.14
    "ifestyle"    -0.14
    "itar"        -0.14
    "ikel"        -0.14
    ".persist"    -0.14
    "ifest"       -0.13
    " therein"    -0.13
    "iples"       -0.13
    " envis"      -0.13
    Positive Logits
    " people"       0.28
    " stories"      0.24
    " someone"      0.22
    " stuff"        0.22
    " companies"    0.21
    " things"       0.21
    "people"        0.20
    " wars"         0.20
    " countries"    0.19
    " diseases"     0.19
    Act Density 0.906%

    No Known Activations