INDEX
    Explanations

    mentions of something being "backed" or supported

    words related to physical actions or states involving "up" or "down"

    Negative Logits
    "beit"         -0.65
    " retri"       -0.61
    " warr"        -0.61
    " carriage"    -0.60
    "scape"        -0.59
    "PF"           -0.59
    " trave"       -0.58
    " Sm"          -0.58
    " Siber"       -0.57
    " ancest"      -0.55
    Positive Logits
    "olicy"        1.20
    "ublic"        0.93
    "rison"        0.87
    "inion"        0.83
    "odcast"       0.82
    "dates"        0.80
    "utics"        0.79
    "osition"      0.78
    "pping"        0.77
    "onent"        0.76
    Activation Density 0.018%

    No Known Activations