    Negative Logits
    Token           Logit
    " Brussels"     -0.07
    "rát"           -0.07
    " Pyongyang"    -0.07
    " sang"         -0.06
    " громад"       -0.06
    "ropolis"       -0.06
    " Ar"           -0.06
    "erna"          -0.06
    " mars"         -0.06
    " Parish"       -0.06
    Positive Logits
    Token           Logit
    " detective"    0.16
    " Detective"    0.16
    " detectives"   0.15
    " отверсти"     0.08
    "UTF"           0.07
    (blank)         0.06
    "scient"        0.06
    " Innovative"   0.06
    (blank)         0.06
    "listed"        0.06
    Activation Density: 0.003%

    No Known Activations