INDEX
    Explanations

    instances where someone is being criticized or blamed for something

    occurrences of the word "for" in various contexts

    Negative Logits
    Token         Logit
    "edin"        -0.87
    "LAB"         -0.81
    "fleet"       -0.77
    "OTAL"        -0.77
    "along"       -0.76
    "atl"         -0.74
    " awaits"     -0.74
    "nin"         -0.74
    "oct"         -0.72
    "hess"        -0.72
    Positive Logits
    Token         Logit
    " centuries"  0.94
    "geries"      0.94
    "gotten"      0.93
    "gery"        0.92
    " example"    0.90
    " daring"     0.89
    " inaction"   0.88
    " decades"    0.85
    "bidden"      0.84
    " awhile"     0.83
    Activation Density: 0.142%

    No Known Activations