INDEX
    Explanations

    specific pronouns followed by verbs

    Negative Logits
    Token             Logit
    "emale"           -0.69
    "luster"          -0.66
    " topp"           -0.66
    "Interstitial"    -0.64
    " smashing"       -0.64
    " exting"         -0.62
    " moderation"     -0.62
    " scra"           -0.62
    " slashing"       -0.61
    " theme"          -0.61
    Positive Logits
    Token             Logit
    " know"           1.36
    "know"            1.32
    " KNOW"           1.32
    "Know"            1.28
    " knows"          1.22
    "knowledge"       1.20
    " knew"           1.19
    " Know"           1.14
    " knowing"        1.09
    "Knowing"         1.03
    Activation Density: 0.367%

    No Known Activations