INDEX
    Negative Logits
        Token (quoted)    Logit
        " lev"            -0.06
        "zm"              -0.06
        " outfit"         -0.06
        " öğren"          -0.06
        "(cv"             -0.06
        " 환산"            -0.06
        " міста"          -0.06
        "-expanded"       -0.06
        " luxurious"      -0.05
        "ética"           -0.05

    Positive Logits
        Token (quoted)    Logit
        " Igor"           0.07
        " imageURL"       0.07
        " |:"             0.07
        " bir"            0.06
        "_dicts"          0.06
        " tarih"          0.06
        " bones"          0.06
        "+')"             0.06
        "icast"           0.06
        "']]"             0.06

    Act Density 0.087%

    No Known Activations