INDEX
    Explanations

    references to "the" that signal significant aspects of context or commentary

    Negative Logits
    Token           Logit
    " overall"      -0.17
    " situation"    -0.16
    "apolis"        -0.15
    " Overall"      -0.15
    " nature"       -0.15
    " degree"       -0.15
    " Uncomment"    -0.15
    "overall"       -0.15
    " idea"         -0.14
    "xic"           -0.14
    Positive Logits
    Token           Logit
    " sudden"       0.19
    " available"    0.18
    " different"    0.17
    " goodness"     0.17
    " talk"         0.17
    " necessary"    0.17
    "/all"          0.17
    " rage"         0.17
    "owing"         0.17
    "uded"          0.17
    Act Density 0.096%

    No Known Activations