INDEX
    Explanations

    words or phrases that have meanings in different languages

    definitions and meanings of words or terms

    Negative Logits
    Token      Logit
    YD         -0.87
    ADS        -0.84
    CSS        -0.84
     Vaugh     -0.81
    IRC        -0.81
    Ns         -0.81
    ETS        -0.81
    HL         -0.80
    JC         -0.78
    AMS        -0.78
    Positive Logits
    Token        Logit
     "'          0.94
     "           0.91
     "(          0.84
     "[          0.82
     liar        0.80
     ",          0.79
     \"          0.78
     servant     0.78
     pleasure    0.77
     foreigner   0.77
    Act Density 0.094%

    No Known Activations