    Negative Logits
    Token         Logit
    " tin"        -0.10
    "Х"           -0.08
    " studied"    -0.08
    "züge"        -0.08
    "ασ"          -0.08
    " swivel"     -0.07
    " تج"         -0.07
    "Election"    -0.07
    "بحث"         -0.07
    "ırl"         -0.07
    Positive Logits
    Token            Logit
    "路线"           0.09
    "用品"           0.09
    " sightseeing"   0.08
    " Visa"          0.08
    ".visit"         0.08
    " turismo"       0.08
    "allu"           0.08
    (blank)          0.08
    " atrak"         0.08
    " Knot"          0.07
    Activation Density: 0.010%

    No Known Activations