INDEX
    Explanations

    phrases indicating clear conclusions or evaluations

    instances of the word "clearly" emphasizing transparency or obviousness in statements

    Negative Logits
    uese          -0.79
    aily          -0.76
    oleon         -0.70
    anish         -0.68
    umption       -0.68
    hell          -0.68
    awaru         -0.68
    lav           -0.67
    urch          -0.67
    rost          -0.67
    Positive Logits
     deline             0.97
     marked             0.84
     identifiable       0.82
     distinguish        0.80
     differentiated     0.78
     readable           0.75
     articulated        0.74
     outwe              0.74
     differentiate      0.74
     spelled            0.73
    Activation Density: 0.027%

    No Known Activations