INDEX
    Explanations

    phrases associated with instructions or guidance

    Negative Logits
    " sk"     -0.70
    " ste"    -0.65
    " bet"    -0.63
    " p"      -0.63
    " inv"    -0.63
    "-"       -0.62
    " sti"    -0.61
    " re"     -0.61
              -0.60
    " ri"     -0.60
    Positive Logits
    " myſelf"        1.53
    " himſelf"       1.50
    " itſelf"        1.48
    " auffi"         1.44
    " ſeveral"       1.42
    "ſelf"           1.42
    " themſelves"    1.41
    " ainfi"         1.39
    " againſt"       1.38
    " ſhe"           1.37
    Activation Density: 0.164%

    No Known Activations