INDEX
    Explanations

    references to specific locations or entities followed by an action or state

    occurrences of the word "the" in various contexts

    Negative Logits
    thood         -0.87
    eed           -0.81
     because      -0.78
    Ò             -0.76
     besides      -0.74
    verage        -0.72
    plete         -0.72
    leeve         -0.71
    tsy           -0.69
    ago           -0.69
    Positive Logits
     slightest    1.03
     biggest      0.98
     entire       0.96
     majority     0.95
     simplest     0.95
     entirety     0.95
     temptation   0.94
     easiest      0.91
     greatest     0.91
     likelihood   0.90
    Activation Density 0.279%

    No Known Activations