INDEX
    Explanations

    names of individuals, particularly doctors and people mentioned in the text

    Negative Logits
     increa      -2.13
     affor       -2.10
     desir       -2.09
     inev        -2.06
     volunte     -2.04
     unden       -1.99
     guarante    -1.99
     fuf         -1.98
     emphat      -1.98
     purcha      -1.97
    Positive Logits
    .            1.15
    ;            0.97
                 0.93
    ).           0.88
    ,            0.88
    ."           0.83
    !            0.82
    :            0.81
    .)           0.81
    );           0.81
    Act Density 0.201%

    No Known Activations