INDEX
    Explanations
    Negative Logits
        "очной"          -0.07
        "akah"           -0.07
        "]=="            -0.07
        "aşı"            -0.06
        " pedestrian"    -0.06
        "PHA"            -0.06
        "anden"          -0.06
        "\tsleep"        -0.06
        "acciones"       -0.06
        "统计"           -0.06
    Positive Logits
        " firm"          0.13
        " Firm"          0.10
        " firms"         0.09
        " firma"         0.08
        "firm"           0.08
        "hart"           0.07
        "415"            0.07
        "/testify"       0.07
        "Dim"            0.07
        "(class"         0.07
    Activation Density: 0.006%

    No Known Activations