    Negative Logits
    Token                 Logit
    " vp"                 -0.08
    "posição"             -0.08
    ".bill"               -0.07
    " hil"                -0.07
    "孩童"                -0.07
    (token not shown)     -0.07
    " Neg"                -0.07
    "sold"                -0.06
    (token not shown)     -0.06
    "KL"                  -0.06
    Positive Logits
    Token                 Logit
    (token not shown)     0.07
    (token not shown)     0.07
    "ichen"               0.07
    "thag"                0.07
    "ibernate"            0.07
    "-table"              0.07
    "新模式"              0.07
    (whitespace-only)     0.07
    "ائن"                 0.06
    "_percentage"         0.06
    Activation Density: 0.016%

    No Known Activations