INDEX
    Explanations

    words related to mental states such as agitation, confusion, and paranoia

    words or phrases associated with various forms of "ness," indicating quality or state

    Negative Logits
    ODE          -0.71
     Auschwitz   -0.65
    bern         -0.64
    verbs        -0.64
    ORN          -0.62
    ask          -0.62
    WAR          -0.62
    ellar        -0.61
    amen         -0.60
    veh          -0.60
    Positive Logits
    iness        1.16
    terness      0.97
    ness         0.93
    yy           0.87
    nesses       0.85
    ionage       0.82
    ĪĴ           0.81
    hip          0.79
    yk           0.79
    edly         0.73
    Act Density 0.027%

    No Known Activations