    Explanations

    phrases expressing uncertainty about actions or knowledge

    Negative Logits
    Token                     Logit
    " ―――――"                  -1.20
    " itſelf"                 -1.20
    " Efq"                    -1.19
    " pleaſure"               -1.18
    " ſind"                   -1.15
    " ſeveral"                -1.15
    " Majefty"                -1.15
    " Anſ"                    -1.14
    " Jefus"                  -1.12
    " Monfieur"               -1.12

    Positive Logits
    Token                     Logit
    "."                       0.66
    " a"                      0.65
    " ("                      0.65
    (token not captured)      0.61
    ","                       0.60
    " in"                     0.57
    " and"                    0.56
    " I"                      0.55
    " of"                     0.54
    " -"                      0.53

    Activation Density: 0.126%

    No Known Activations