    Explanations

    words related to standards or preferred methods of operation

    Negative Logits
    Token          Logit
    "igh"          -0.82
    "iture"        -0.78
    "ividual"      -0.78
    "inately"      -0.76
    "iosyncr"      -0.74
    "atto"         -0.73
    "iry"          -0.72
    "milo"         -0.72
    "icious"       -0.71
    "ifer"         -0.68
    Positive Logits
    Token          Logit
    "fare"         0.77
    " Europeans"   0.72
    "ward"         0.71
    "soever"       0.70
    "forward"      0.68
    " THEY"        0.68
    " they"        0.67
    "backs"        0.65
    "finding"      0.65
    " norm"        0.65
    Activation Density: 0.019%

    No Known Activations