INDEX
    Explanations

    conjunctions followed by explanations

    NEGATIVE LOGITS
    Token              Value
    " l"               0.47
    " máme"            0.46
    " tenemos"         0.42
    " kita"            0.41
    (not shown)        0.41
    " n"               0.41
    "ERO"              0.40
    (not shown)        0.40
    (not shown)        0.40
    " máte"            0.40
    POSITIVE LOGITS
    Token              Value
    " thereby"         0.41
    " inaccur"         0.40
    " captivated"      0.40
    "점에서"            0.40
    " inextricably"    0.40
    " troubled"        0.39
    " narrowly"        0.39
    " prompting"       0.39
    " draped"          0.39
    " noticeably"      0.38
    Act Density 0.942%

    No Known Activations