INDEX
    Explanations

    phrases related to physical actions or descriptions

    pronouns related to people

    Negative Logits
     Gerard      -0.69
     Mit         -0.67
     Majority    -0.66
     Emer        -0.65
     Cliff       -0.64
     Pompe       -0.64
     Beacon      -0.64
     Junk        -0.64
     Anat        -0.63
     Petraeus    -0.63
    Positive Logits
    arers    0.98
    pton     0.91
    ÃĥÃĤ     0.91
    ared     0.90
    've      0.90
    arer     0.89
    til      0.88
    'll      0.86
    hots     0.85
    ldon     0.84
    Activation Density: 0.305%

    No Known Activations