    Negative Logits
    Token              Logit
    "OrNil"            -0.72
    " SEDS"            -0.68
    " հղումներ"        -0.68
    "-------"          -0.67
    "ண்டும்"           -0.65
    "Ligações"         -0.65
    " nhật"            -0.65
    "NameInMap"        -0.63
    "routeProvider"    -0.63
    " esternos"        -0.63
    Positive Logits
    Token              Logit
    " thank"           0.86
    "thank"            0.81
    "Thank"            0.79
    "eschön"           0.78
    " Thank"           0.77
    "Thanks"           0.76
    " thanking"        0.75
    " THANK"           0.74
    " thanks"          0.74
    " gracias"         0.74
    Activation Density: 0.044%

    No Known Activations