INDEX
    Negative Logits
    Token               Logit
    "vode"              -0.76
    "Morte"             -0.74
    "verdi"             -0.74
    " Parce"            -0.73
    " Coolidge"         -0.72
    "mitian"            -0.72
    "orph"              -0.72
    "どり"              -0.71
    "pepe"              -0.70
    " themselves"       -0.70
    Positive Logits
    Token               Logit
    " has"              1.27
    " Has"              1.17
    " HAS"              1.17
    "has"               1.01
    "Has"               1.00
    " is"               0.94
    "HandlerContext"    0.93
    "HAS"               0.90
    " έχει"             0.87
    "expandindo"        0.86
    Activation Density: 0.150%

    No Known Activations