INDEX
    Explanations

    referential pronouns and demonstratives

    demonstrative pronouns/determiners across languages

    Negative Logits
    " Ause"            -0.59
    " kasarigan"       -0.54
    "HERO"             -0.50
    "hors"             -0.50
    " mouseClicked"    -0.50
    " Scaling"         -0.49
    " Installer"       -0.48
    "nungszeiten"      -0.48
    "-------"          -0.48
    "ticides"          -0.47
    Positive Logits
    " THAT"            0.72
    "That"             0.72
    " That"            0.69
    " same"            0.67
    " that"            0.63
    "THAT"             0.62
    " aquello"         0.60
    " Celui"           0.59
    " того"            0.57
    " том"             0.56
    Activation Density: 0.005%

    No Known Activations