INDEX
    Explanations

    the phrase "after" followed by numbers or modifiers indicating time

    Negative Logits
    "zelf"              -0.15
    "ões"               -0.14
    "ÑĢажд"             -0.14
    "åĢij"               -0.14
    "vÃŃ"               -0.14
    " âĹĦ"              -0.14
    "GIN"               -0.14
    "ErrorException"    -0.14
    "æķ¬"               -0.14
    "uchen"             -0.13
    Positive Logits
    "wards"             0.41
    "ward"              0.40
    "words"             0.40
    "word"              0.34
    "WARDS"             0.34
    " wards"            0.33
    "thought"           0.32
    "effects"           0.31
    "no"                0.29
    "WARD"              0.28
    Activation Density: 0.113%

    No Known Activations