INDEX
    Explanations

    instances where the speaker expresses their personal thoughts or opinions

    instances of the word "that."

    Negative Logits
    aukee           -0.82
    ãĥīãĥ©          -0.79
    aunder          -0.78
    ciples          -0.76
    iped            -0.74
    ãĥ©ãĥ³          -0.68
    IELD            -0.68
    earable         -0.67
    ipers           -0.67
    cakes           -0.67
    Positive Logits
     although       0.82
     contradicts    0.81
     happens        0.77
     applies        0.75
     translates     0.75
     happened       0.73
     justifies      0.73
     relates        0.73
     sounds         0.72
     whoever        0.70
    Act Density 0.247%

    No Known Activations