INDEX
    Explanations
    NEGATIVE LOGITS
     ali          -0.07
    -yyyy         -0.07
     injector     -0.07
     Interstate   -0.07
    .off          -0.07
     medidas      -0.06
     diseases     -0.06
    _ping         -0.06
     temples      -0.06
     lig          -0.06
    POSITIVE LOGITS
     astronauts   0.12
     astronaut    0.11
    onaut         0.07
     rien         0.06
    _REF          0.06
     composing    0.06
    Brazil        0.06
    工作          0.06
    rou           0.06
     Watches      0.06
    Activation Density: 0.003%

    No Known Activations