    Explanations

    words related to averages and typical values

    Negative Logits
    Token                 Logit
    " DialogInterface"    -0.71
    " sk"                 -0.69
    "̀n"                  -0.68
    " re"                 -0.67
    "sp"                  -0.66
    " Je"                 -0.65
    " Kirk"               -0.64
    " Gun"                -0.63
    " Hel"                -0.63
    " Robertson"          -0.63
    Positive Logits
    Token                 Logit
    " Average"            1.87
    " AVERAGE"            1.84
    "Average"             1.83
    " average"            1.83
    "average"             1.81
    " averages"           1.77
    "AVERAGE"             1.77
    " averaging"          1.67
    "verages"             1.62
    " Avg"                1.60
    Activation Density: 0.064%

    No Known Activations