    Negative Logits
    -0.08
     Destiny
    -0.08
    (+
    -0.08
    (goal
    -0.07
    ifikation
    -0.07
    รู้
    -0.07
    性生活
    -0.07
    (connection
    -0.07
    	  
    -0.07
     çıkt
    -0.07
    Positive Logits
    "Behind"       0.11
    " behind"      0.10
    " Behind"      0.10
    " पर्द"         0.10
    "幕后"          0.09
    "Walls"        0.09
    " وراء"        0.09
    " cach"        0.09
    " Walls"       0.09
    " derrière"    0.09
    Activation Density 0.017%

    No Known Activations