INDEX
    Explanations

    phrases related to significant historical events and their impacts

    Negative Logits
    " studied"       -0.50
    " visited"       -0.49
    " Studied"       -0.47
    "studied"        -0.46
    " watched"       -0.45
    " Observed"      -0.43
    "visited"        -0.43
    " researched"    -0.42
    " observed"      -0.42
    "watched"        -0.41

    Positive Logits
    " led"            1.59
    " enabled"        1.49
    " helped"         1.48
    " prompted"       1.39
    " caused"         1.36
    "helped"          1.31
    " allowed"        1.28
    " brought"        1.27
    " prevented"      1.22
    " spurred"        1.20
    Activation Density: 1.535%

    No Known Activations