INDEX
    Explanations

    positive change or improvement

    NEGATIVE LOGITS
    volving    0.38
     ספר    0.37
    itabbo    0.36
    lhe    0.35
    Ф    0.35
     Classification    0.35
     designations    0.34
    িনী    0.33
     указыва    0.33
    ូប    0.33
    POSITIVE LOGITS
     mejorar    0.79
     improve    0.77
     mejora    0.76
     melhorar    0.72
     migliorare    0.72
     improves    0.70
     cải    0.70
     amélior    0.68
     améliorer    0.68
     улуч    0.67
    Act Density 0.115%

    No Known Activations