INDEX
    Explanations

    occurrences of pronouns and related predicate constructions

    Negative Logits
    "Portail"          -0.42
    "TestingModule"    -0.41
    " compositeur"     -0.40
    " Cots"            -0.37
    "UNUSED"           -0.37
    " ū"               -0.36
    " paddling"        -0.35
    " barras"          -0.35
    " Мексичка"        -0.34
    " تضيفلها"         -0.34
    Positive Logits
    (token not shown)  0.65
    " للاسماء"         0.57
    "devamını"         0.55
    (token not shown)  0.54
    "将她"             0.54
    " zijne"           0.53
    " Ее"              0.52
    " ją"              0.51
    " אותו"            0.50
    "orghini"          0.50
    Act Density 0.024%

    No Known Activations