INDEX
    Explanations

    mention of the word "the" in various contexts

    Negative Logits
    "deals"      -0.70
    "ettings"    -0.67
    "peak"       -0.65
    "puff"       -0.65
    "ben"        -0.63
    "hooting"    -0.63
    "duction"    -0.61
    "oshi"       -0.60
    "lang"       -0.59
    "ward"       -0.59
    Positive Logits
    " opportunity"    1.22
    " same"           1.14
    " slightest"      1.11
    " utmost"         1.08
    " privilege"      0.98
    " requisite"      0.97
    " idea"           0.96
    " courage"        0.95
    " guts"           0.93
    " brunt"          0.91
    Activation Density: 0.054%

    No Known Activations