INDEX
    Explanations

    good or bad descriptions

    Negative Logits
    Token          Logit
    "joint"        -0.09
    " Pek"         -0.09
    " ener"        -0.09
    "artisan"      -0.09
    " Caller"      -0.09
    "erty"         -0.08
    "aison"        -0.08
    " Neville"     -0.08
    " Cousins"     -0.08
    "uct"          -0.08
    Positive Logits
    Token          Logit
    " person"      0.21
    " citizen"     0.18
    " citizens"    0.15
    " listener"    0.14
    " human"       0.14
    " Person"      0.13
    " listeners"   0.13
    " friend"      0.13
    "cit"          0.12
    " daughter"    0.12
    Activation Density: 0.079%

    No Known Activations