    Explanations

    Medical context

    Negative Logits
        Token       Logit
        mouths      -1.23
        orale       -1.02
        throats     -0.98
        pleaſure    -0.93
        oral        -0.85
        Monfieur    -0.85
        orally      -0.84
        aloud       -0.84
        knots       -0.83
        Majefty     -0.82
    Positive Logits
        Token       Logit
        for         0.62
        a           0.58
        it          0.58
        with        0.57
        when        0.56
        Z           0.54
        Man         0.54
        some        0.52
        Men         0.51
        Hem         0.51
    Activation Density: 0.039%

    No Known Activations