INDEX
    Negative Logits
    "bas"             -0.07
    (blank)           -0.07
    " calmly"         -0.07
    " Na"             -0.07
    " singles"        -0.07
    "|--------------------------------------------------------------------------↵"  -0.06
    "departments"     -0.06
    "اضر"             -0.06
    " dar"            -0.06
    " tame"           -0.06
    Positive Logits
    "<|start_header_id|>"  0.07
    "-INF"            0.06
    ".getBoolean"     0.06
    "/output"         0.06
    "δοση"            0.06
    "RARY"            0.06
    "จะได"            0.06
    " avons"          0.06
    " инт"            0.06
    " incorporation"  0.06
    Activation Density: 0.002%

    No Known Activations