    Negative Logits
    Token            Logit
    " petroleum"     -0.07
    "Enabled"        -0.06
    " are"           -0.06
    " invariant"     -0.06
    " iy"            -0.06
    "-green"         -0.06
    " sexuality"     -0.06
    "initialized"    -0.05
    " fills"         -0.05
    "izu"            -0.05
    Positive Logits
    Token            Logit
    "rott"           0.07
    " ไทย"           0.07
    "。("            0.07
    " [@"            0.07
    " Thames"        0.07
    "@Controller"    0.07
    ".pick"          0.06
    " voiced"        0.06
    "VRTX"           0.06
    "(Message"       0.06
    Activation Density: 0.072%

    No Known Activations