INDEX
    Explanations

    pronoun followed by verb phrase

    Negative Logits
     violations     0.50
     aviation       0.48
    unting          0.47
                    0.46
    essä            0.45
     seepage        0.44
     प्रतिकूल         0.44
    そば            0.43
    ageddon         0.42
                    0.42

    Positive Logits
     for            0.50
    з               0.44
     cleanly        0.42
     entspre        0.42
    ેલ              0.42
     compon         0.41
    ونم             0.41
     Gilroy         0.41
                    0.41
     definitively   0.41
    Activation Density: 0.002%

    No Known Activations