    Negative Logits
    Token             Logit
    ' ring'           -0.08
    ' kt'             -0.07
    ' cấu'            -0.07
    ' jobject'        -0.07
    ' copper'         -0.07
    ' 존재'            -0.07
    '.moveToFirst'    -0.07
    '="#">↵'          -0.07
    ' süt'            -0.07
    'utt'
    Positive Logits
    Token             Logit
    ' expenses'       0.09
    ' Expenses'       0.08
    'سة'              0.07
    ' Lisa'           0.07
    ' expense'        0.07
    'Lisa'            0.07
    ' Romance'        0.07
    'министра'        0.07
    'Despite'         0.07
                      0.07
    Act Density 0.005%

    No Known Activations