INDEX
    Explanations
    Negative Logits
     людям         -0.07
    ultural        -0.06
    .converter     -0.06
    avana          -0.06
     impulses      -0.06
     `(            -0.06
     flows         -0.06
     Goals         -0.06
     Colleges      -0.06
    -0.06
    Positive Logits
     incarcer      0.07
     đọ            0.06
     tracer        0.06
    creator        0.06
    =temp          0.06
     چشم           0.06
    _counters      0.06
     باشگاه        0.06
    xce            0.06
     окруж         0.06
    Act Density 0.002%

    No Known Activations