    Negative Logits
    " Tehran"     -0.09
    " bedroom"    -0.09
    "Expiration"  -0.08
    " flask"      -0.08
    " offices"    -0.08
    " ensuite"    -0.08
    " tinder"     -0.08
    " investor"   -0.07
    "Seattle"     -0.07
    "աց"          -0.07
    Positive Logits
    "games"        0.08
    " الألعاب"     0.08
    " Def"         0.08
    "DEF"          0.08
    "obora"        0.08
    ".eu"          0.08
    " surprising"  0.07
    "Defs"         0.07
    "Def"          0.07
    "dol"          0.07
    Activation Density: 0.001%

    No Known Activations