INDEX
    Explanations

    phrases related to providing information or instructions to the user

    references to the reader or listener

    Negative Logits
    ftime         -0.79
     Sabha        -0.75
     Fried        -0.66
    Tang          -0.66
    ĸļ            -0.64
     Ange         -0.64
     Advent       -0.62
     Course       -0.62
     Agriculture  -0.62
    adelphia      -0.61
    Positive Logits
     guys         1.03
    're           1.02
    tub           0.96
     know         0.89
    RS            0.87
     naughty      0.78
    've           0.76
     glimpse      0.74
     decide       0.72
     understand   0.71
    Act Density 0.079%

    No Known Activations