INDEX
    Explanations
    Negative Logits
    "_Entry"            -0.07
    " prohibiting"      -0.06
    "Build"             -0.06
    "losure"            -0.06
    "Skip"              -0.06
    "unidad"            -0.06
    " punishing"        -0.06
    " FC"               -0.06
    "-orders"           -0.06
    "INSTANCE"          -0.06
    Positive Logits
    "ливо"              0.06
    " fatty"            0.06
    " světa"            0.06
    "rf"                0.06
    " Fundamental"      0.06
    "erap"              0.06
    "ấp"                0.06
    "ermann"            0.06
    " professionalism"  0.06
    " nắng"             0.06
    Activation Density: 0.053%

    No Known Activations