INDEX
    Explanations

    phrases with personal pronouns followed by verbs

    references to a specific female subject

    Negative Logits
    Token           Logit
    " Skydragon"    -0.73
    "INGTON"        -0.69
    "atory"         -0.68
    "ornia"         -0.67
    "assing"        -0.67
    "kefeller"      -0.65
    " shaping"      -0.61
    "~~~~"          -0.60
    "ouver"         -0.59
    "ilateral"      -0.59
    Positive Logits
    Token           Logit
    "pherd"         1.38
    "pher"          1.31
    "pard"          1.20
    "ffield"        1.14
    "athed"         1.13
    "athing"        1.11
    "ppard"         1.10
    "ldon"          1.09
    "lly"           0.96
    "ikh"           0.95
    Activation Density: 0.088%

    No Known Activations