INDEX
    Explanations

    references to social and cultural phenomena

    Negative Logits
    Token            Logit
    " overall"       -0.20
    "969"            -0.15
    "overall"        -0.15
    " initially"     -0.15
    " yani"          -0.14
    " initial"       -0.14
    " özellikle"     -0.14
    "907"            -0.14
    " adulte"        -0.14
    "ertas"          -0.14
    Positive Logits
    Token            Logit
    "instead"        0.19
    "while"          0.19
    " while"         0.19
    " instead"       0.17
    " WHILE"         0.17
    "uzzi"           0.16
    "_while"         0.15
    "ecause"         0.15
    " whilst"        0.15
    "because"        0.15
    Activation Density: 0.726%

    No Known Activations