    NEGATIVE LOGITS
    Token             Logit
    " share"          -0.08
    " HOW"            -0.07
    " overflowing"    -0.07
    "ombie"           -0.07
    "er"              -0.07
    " impression"     -0.07
    ".FIELD"          -0.07
    " time"           -0.07
    "or"              -0.07
    "Raw"             -0.07
    POSITIVE LOGITS
    Token             Logit
    " Its"            0.15
    " its"            0.15
    "Its"             0.12
    " ITS"            0.11
    "ITS"             0.10
    " His"            0.08
    " It"             0.08
    "TS"              0.07
    "ーツ"            0.07
    " Stevens"        0.07
    Activation Density: 0.103%

    No Known Activations