INDEX
    Explanations

    references to items or concepts indicated by demonstrative pronouns

    Negative Logits
     Kleidung         -0.63
     informaci        -0.61
    zieży             -0.55
     gouttes          -0.55
     pouvoirs         -0.55
    чик               -0.54
     barnet           -0.53
     गया              -0.52
     gouvernements    -0.51
     thư              -0.49
    Positive Logits
     Theſe            1.11
    NameInMap         0.99
     theses           0.98
     autorytatywna    0.96
     Theses           0.94
     للاسماء          0.94
     these            0.93
    Những             0.91
    These             0.90
    %")               0.88
    Activation Density: 0.157%

    No Known Activations