INDEX
    Explanations
    Negative Logits
     Majority        -0.08
     fenced          -0.07
     disgrace        -0.07
     majority        -0.07
    ел               -0.07
     persons         -0.07
    bije             -0.07
    ам               -0.07
    男女             -0.07
    ammer            -0.07
    Positive Logits
     Yun             0.09
     Rosa            0.08
     Ries            0.08
    .Entity          0.08
    'une             0.07
     occasional      0.07
     เพล             0.07
    .environment     0.07
     Rigidbody       0.07
     phía            0.07
    Activation Density: 0.002%

    No Known Activations