INDEX
    Explanations

    win-win situations

    NEGATIVE LOGITS
    işti  -0.06
     서로  -0.06
    curso  -0.06
     fulfillment  -0.06
     cuer  -0.06
     muy  -0.06
     nas  -0.06
    	attack  -0.06
    ،  -0.06
    θος  -0.06
    POSITIVE LOGITS
     ppt  0.06
     lựa  0.06
     Shard  0.06
     Scotch  0.06
     dob  0.06
     ζ  0.06
    ")}  0.06
    του  0.06
    .getD  0.06
     Scout  0.06
    Act Density 0.051%

    No Known Activations