INDEX
    Explanations

    numerical quantities followed by units of measurement

    occurrences of the word "approximately."

    Negative Logits
    "woods"     -0.78
    "ters"      -0.77
    "era"       -0.76
    "ny"        -0.73
    "lined"     -0.72
    "ieu"       -0.70
    "ned"       -0.68
    "neys"      -0.66
    "jit"       -0.66
    "enders"    -0.65
    Positive Logits
    " 200"      0.81
    " 400"      0.76
    " 3000"     0.75
    " 550"      0.75
    " 9000"     0.75
    " 820"      0.75
    " 175"      0.75
    " âĸĪ"      0.74
    " 800"      0.74
    " 250"      0.74
    Activation Density: 0.017%

    No Known Activations