    NEGATIVE LOGITS
    __));↵       -0.08
     advocate    -0.08
    	UPROPERTY   -0.08
     Adoles      -0.07
     ).          -0.07
    Operand      -0.07
     získal      -0.07
    podob        -0.07
     Vintage     -0.07
     mer         -0.07
    POSITIVE LOGITS
    icemail      0.07
    ARGE         0.06
                 0.06
     kry         0.06
     Iss         0.06
    ipping       0.06
    ayi          0.06
    AGING        0.06
    attack       0.06
    (nx          0.05
    Activation Density: 0.005%

    No Known Activations