    Explanations

    the word "did" in various contexts

    Negative Logits
    Token         Logit
    " houſe"      -0.80
    " pleaſure"   -0.77
    " ſtate"      -0.75
    " Houſe"      -0.69
    "ſelves"      -0.68
    " purpoſe"    -0.64
    " ſtre"       -0.63
    "NameInMap"   -0.60
    " ſche"       -0.59
    " ſte"        -0.57
    Positive Logits
    Token         Logit
    "did"         0.80
    " did"        0.77
    " Did"        0.77
    "Did"         0.69
    " was"        0.66
    " DID"        0.64
    "DID"         0.58
    " had"        0.57
    "Twas"        0.56
    " gave"       0.55
    Activation Density: 0.055%

    No Known Activations