INDEX
    Explanations

    phrases related to health conditions or treatments

    Negative Logits
     ſtate        -1.25
     itſelf       -1.20
     Shakspeare   -1.16
     Houſe        -1.15
     myſelf       -1.15
     iſt          -1.14
     themſelves   -1.12
     Diſ          -1.12
     Monfieur     -1.12
     uſe          -1.11
    Positive Logits
    ↵↵            0.70
    ?             0.63
    ,             0.63
    ↵↵↵           0.59
    .             0.57
    :             0.57
     «            0.56
    <eos>         0.55
    !             0.55
     A            0.55
    Activation Density: 0.033%

    No Known Activations