INDEX
    Explanations

    language indicating focus on particular topics or questions

    phrases introducing or referencing significant questions or topics

    Negative Logits
    ahime       -0.70
    aq          -0.70
    ugu         -0.69
    iola        -0.62
    hent        -0.61
    pling       -0.60
    say         -0.59
    aga         -0.59
    usk         -0.58
    onday       -0.57
    Positive Logits
     occurs      1.07
     deserves    1.05
     arises      1.04
     begs        1.02
     haun        1.01
     ought       1.01
     awaits      1.00
     arose       1.00
     happens     0.99
     hasn        0.97
    Activation Density: 0.166%

    No Known Activations