INDEX
    Explanations

    explaining how things operate or are structured

    Negative Logits
     awful        0.39
     horrible     0.38
     terrible     0.33
     hurting      0.31
     неуда        0.30
     fainting     0.30
    失败          0.30
     dreadful     0.30
     losers       0.29
     crappy       0.29
    Positive Logits
     overseen         0.45
     headquartered    0.43
     underpinned      0.43
     actively         0.39
     governed         0.39
     operates         0.36
     supplemented     0.36
     intricately      0.36
     routinely        0.35
     wholly           0.35
    Act Density 0.000%

    No Known Activations