INDEX
    Explanations

    phrases indicating certainty or emphasis

    phrases asserting impossibility or the absence of a method

    Negative Logits
    Token           Logit
    "ilts"          -0.81
    "rongh"         -0.81
    "asts"          -0.79
    "eg"            -0.79
    " livest"       -0.78
    "igl"           -0.70
    "aples"         -0.69
    "origin"        -0.69
    "lux"           -0.68
    "itton"         -0.68
    Positive Logits
    Token           Logit
    " whatsoever"   0.92
    " else"         0.77
    " anybody"      0.77
    " anyone"       0.72
    "Reviewer"      0.72
    " anymore"      0.70
    "ouver"         0.66
    " THEY"         0.66
    "point"         0.64
    " around"       0.62
    Activation Density: 0.036%

    No Known Activations