INDEX
    Explanations

    words ending in -ton, -inder, -ier

    Negative Logits
    Token                 Value
    "("                   0.50
    ":"                   0.48
    "\"                   0.45
    "的時候"              0.43
    [token not shown]     0.41
    "的时候"              0.41
    [token not shown]     0.41
    "లో"                  0.40
    " συγκεκρι"           0.40
    " 대회"               0.40
    Positive Logits
    Token                 Value
    "Phir"                0.50
    "𐱃"                   0.47
    [token not shown]     0.46
    [token not shown]     0.46
    " omnis"              0.46
    "IIUM"                0.46
    " indifferent"        0.45
    " unmanned"           0.44
    " poesía"             0.44
    " Parser"             0.44
    Activation Density: 0.007%

    No Known Activations