    Explanations

    greetings followed by names or titles

    Negative Logits
        Token               Value
        " dvara"            0.59
        " bidirectional"    0.54
        " interrelated"     0.52
        " hypothalamic"     0.52
        " nontrivial"       0.51
        "akati"             0.51
        " vatth"            0.51
        " idiosyncratic"    0.51
        " ሌሎች"              0.51
        (token missing)     0.51
    Positive Logits
        Token               Value
        " dear"             1.63
        " sir"              1.54
        "Dear"              1.34
        " dearest"          1.34
        "dear"              1.34
        "sir"               1.29
        " monsieur"         1.27
        " جناب"             1.26
        " Dear"             1.25
        " Sir"              1.23
    Activation Density: 0.377%

    No Known Activations