INDEX
    Explanations

    prepositions followed by abstract concepts or actions

    prepositions and words indicating relationships or specifics about a topic

    Negative Logits
    Token          Logit
    "quartered"    -0.81
    "olitical"     -0.81
    "quet"         -0.79
    "ynthesis"     -0.76
    "ires"         -0.73
    "acy"          -0.69
    "ashington"    -0.68
    "ired"         -0.67
    "ensable"      -0.67
    "eteenth"      -0.67
    Positive Logits
    Token            Logit
    " somet"         0.85
    " this"          0.83
    " myself"        0.83
    " figuring"      0.82
    " yours"         0.81
    " ya"            0.81
    " twitter"       0.81
    " yourselves"    0.80
    " THAT"          0.78
    " it"            0.77
    Act Density 0.543%
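
The page does not define how the density figure is computed. A minimal sketch, assuming "Act Density" means the fraction of token positions on which the feature's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density counts a position as active when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 5 of which activate the feature.
sample = [0.0] * 995 + [1.2, 0.4, 2.5, 0.1, 0.9]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.543% would correspond to the feature firing on roughly 5 or 6 of every 1000 tokens.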

    No Known Activations