INDEX
    Explanations

    words related to positive emotions or characteristics

    adjectives that describe various qualities or characteristics

    Negative Logits
    "ajor"            -0.85
    "Downloadha"      -0.71
    " strengthened"   -0.67
    " inaug"          -0.67
    " corresponding"  -0.67
    "authorized"      -0.65
    "quart"           -0.63
    "supported"       -0.63
    "conservancy"     -0.63
    "onto"            -0.63
    Positive Logits
    "ness"      1.25
    "ly"        1.13
    " Enough"   1.08
    "nesses"    1.07
    "NESS"      1.00
    "est"       0.97
    " enough"   0.95
    "LY"        0.87
    " Bastard"  0.83
    "glers"     0.82
    Act Density 0.218%

    No Known Activations