INDEX
    Explanations

    phrases indicating importance, value, or focus

    phrases introducing significant concepts or statements

    Negative Logits
    robe            -0.64
    say             -0.63
    aciously        -0.63
    order           -0.62
    ahime           -0.62
    hip             -0.62
    mission         -0.61
    pointer         -0.61
    erton           -0.61
    due             -0.59
    Positive Logits
     bothers         1.32
     distinguishes   1.23
     happens         1.23
     separates       1.20
     happened        1.16
     mattered        1.13
     bothered        1.09
     hurts           1.00
     pops            1.00
     sticks          1.00
    Activation Density: 0.137%

    No Known Activations