    Negative Logits
    .grad            -0.08
    ayar             -0.06
     Tribe           -0.06
     proletariat     -0.06
    -summary         -0.06
     acre            -0.06
     explo           -0.06
     Cer             -0.06
     overlap         -0.06
    .LinearLayout    -0.06
    Positive Logits
     femme           0.07
     nghiệp          0.06
     lég             0.06
    keeping          0.06
    引用频次         0.06
    ":[-             0.06
    alfa             0.06
    vine             0.06
     również         0.06
     máme            0.06
    Activation Density: 0.005%

    No Known Activations