INDEX
    Explanations

    phrases that indicate creation, ownership, or reference to content and its management

    Negative Logits
    Token         Logit
    " ones"       -0.17
    "472"         -0.16
    "्द"           -0.16
    "一个"         -0.15
    " mini"       -0.15
    "ibles"       -0.15
    "ones"        -0.15
    "olls"        -0.14
    " Mus"        -0.14
    " number"     -0.14

    Positive Logits
    Token         Logit
    " pieces"     0.20
    " piece"      0.18
    "Pieces"      0.17
    "pieces"      0.16
    " Pieces"     0.16
    " مقدار"      0.16
    " Rounds"     0.15
    "Piece"       0.15
    "stuff"       0.15
    " Amount"     0.14
    Activation Density: 0.306%

    No Known Activations