INDEX
    Explanations

    gratitude and acknowledgment expressions

    phrases emphasizing the word "first" in various contexts

    Negative Logits
    Token         Logit
    "etheless"    -1.00
    "cffff"       -0.72
    "sung"        -0.71
    "atten"       -0.68
    "laugh"       -0.68
    "nes"         -0.68
    "sports"      -0.68
    "owl"         -0.66
    "cel"         -0.65
    "far"         -0.65
    Positive Logits
    Token              Logit
    " introdu"         1.03
    "asma"             0.76
    "ODUCT"            0.76
    " assume"          0.67
    " introduction"    0.67
    " premise"         0.65
    " assumption"      0.64
    " Explain"         0.64
    " disclaimer"      0.64
    "ASY"              0.63
    Activation Density: 0.294%

    No Known Activations