INDEX
    Explanations

    articles and determiners in various languages

    Negative Logits
    Token (quoted)          Logit
    ' itſelf'               -0.81
    ' "$@"'                 -0.71
    ' nephe'                -0.70
    ' Houſe'                -0.70
    ' CreateTagHelper'      -0.69
    ' versace'              -0.69
    ' Yugos'                -0.66
    ' creș'                 -0.65
    ' ―――――'                -0.65
    ' philosop'             -0.65
    Positive Logits
    Token (quoted)          Logit
    'The'                    1.19
    ' The'                   1.12
    ' La'                    0.96
    ' the'                   0.96
    ' THE'                   0.94
    'Οι'                     0.93
    ' la'                    0.85
    'THE'                    0.83
    'La'                     0.83
    'Το'                     0.82
    Act Density 0.061%
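
    The density figure above can be read as the share of token positions on which the feature fires. The sketch below shows that calculation under stated assumptions: the function name `activation_density`, the zero threshold, and the toy data are all hypothetical and are not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 6 of which are active.
acts = [0.0] * 10_000
for i in (3, 512, 2048, 4096, 7777, 9001):
    acts[i] = 1.5

density = activation_density(acts)
print(f"Act Density {density:.3%}")
```

    With 6 active positions out of 10,000, the toy density is of the same order as the 0.061% reported here, meaning the feature is silent on the vast majority of tokens.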

    No Known Activations