    Negative Logits
     assistive  0.46
     very  0.43
    非常  0.42
     outset  0.41
     telehealth  0.40
    \  0.39
     including  0.39
     neutralization  0.39
     timeframe  0.38
     desorption  0.38
    Positive Logits
    値段  0.44
     médiocrement  0.43
     झग  0.41
     کھانا  0.41
     цене  0.41
     комфор  0.40
     вечером  0.40
     médioc  0.40
     membeli  0.40
     покупать  0.40
    Activation Density: 0.035%

    No Known Activations