INDEX
    Explanations

    references to regulations and standards in various contexts

    Negative Logits
    " Proud"        -0.15
    "ADDE"          -0.14
    "dde"           -0.14
    "filer"         -0.14
    "oot"           -0.14
    "ìĸµ"           -0.13
    " wsp"          -0.13
    "ç«ĭãģ¦"        -0.13
    "aiser"         -0.13
    " fascinated"   -0.13
    Positive Logits
    " welcome"      0.50
    " welcomed"     0.39
    "welcome"       0.39
    " Welcome"      0.37
    "Welcome"       0.35
    " welcomes"     0.32
    "/welcome"      0.30
    " good"         0.29
    " welcoming"    0.28
    "elcome"        0.28
    Activation Density: 0.178%

    No Known Activations