INDEX
    Explanations
    Negative Logits
    "fila"             -0.06
    " smells"          -0.06
                       -0.06
    "_secondary"       -0.06
    "anoi"             -0.06
    "_artist"          -0.06
    "ovali"            -0.06
    "pressor"          -0.05
    "serialization"    -0.05
    "buy"              -0.05
    Positive Logits
    "\tdefine"         0.07
    "мерикан"          0.07
    " DEFAULT"         0.06
    " Credit"          0.06
    " running"         0.06
    " Photograph"      0.06
    " Jean"            0.06
    " derail"          0.06
    "打开"             0.06
    "_CATEGORY"        0.06
    Activation Density: 0.090%

    No Known Activations