INDEX
    Explanations

    female pronouns

    references to authors and their interpretations

    NEGATIVE LOGITS
    "gettable"         -0.83
    "VERTISEMENT"      -0.80
    " srfAttach"       -0.77
    " Leban"           -0.77
    " Ranked"          -0.69
    " neighb"          -0.68
    "neath"            -0.67
    " Decay"           -0.66
    " Leilan"          -0.66
    " Goo"             -0.65
    POSITIVE LOGITS
    " misunderstand"   1.12
    " misinterpret"    1.08
    " miscon"          1.04
    " quote"           1.03
    " misrepresent"    1.03
    " phr"             1.02
    " misunderstood"   1.02
    " quoting"         0.98
    " correctly"       0.97
    " exagger"         0.94
    Activation Density: 0.588%

    No Known Activations