INDEX
    Explanations

    instances of the word "down" in various contexts

    Negative Logits
    " Mits"       -0.18
    "tle"         -0.17
    "ipar"        -0.17
    "orarily"     -0.16
    "uer"         -0.16
    "d"           -0.16
    "to"          -0.15
    "dz"          -0.15
    "orate"       -0.15
    " Runner"     -0.14
    Positive Logits
    "pour"         0.25
    "patrick"      0.24
    "ey"           0.24
    "graded"       0.23
    "grading"      0.22
    "playing"      0.22
    " syndrome"    0.21
    "grades"       0.21
    " Syndrome"    0.21
    "shift"        0.21
    Activation Density: 0.019%

    No Known Activations