    Negative Logits
    _EXPRESSION     -0.07
    Pal             -0.06
    -president      -0.06
     Госп           -0.06
    (blank)         -0.06
     pastry         -0.06
     intimacy       -0.06
     puzzled        -0.06
     PERFORMANCE    -0.06
     clue           -0.06
    Positive Logits
    (blank)         0.07
    _MAG            0.07
     nons           0.06
    /";↵            0.06
    َك              0.06
    aload           0.06
    ΩΣ              0.06
     indebted       0.06
    	lcd            0.06
    式会社          0.06
    Act Density 0.026%

    No Known Activations