INDEX
    Explanations

    numbers, sodium, collection, characters, stakes, cats

    New Auto-Interp
    Negative Logits
     κατα
    0.48
     inductively
    0.47
     하면서
    0.47
     νο
    0.46
     과정을
    0.45
     controvers
    0.44
     식품
    0.44
    0.43
    ajaran
    0.43
     종합
    0.43
    Positive Logits
    Cancelled
    0.52
    Push
    0.51
    Funny
    0.50
    ok
    0.50
    y
    0.49
    3
    0.49
    4
    0.48
    Tiene
    0.48
    Yes
    0.46
    detect
    0.46
    Act Density 0.000%

    No Known Activations