INDEX
    Explanations

    interrogative and comparative phrases

    Negative Logits
    iaux         -0.08
    lify         -0.08
    itionally    -0.08
     %[          -0.08
    isinden      -0.07
    ricks        -0.07
    icamente     -0.07
    .esp         -0.07
    zsche        -0.07
    apus         -0.07

    Positive Logits
    /or           0.07
    ness          0.07
    ion           0.07
    -to           0.07
    sembl         0.07
    -than         0.07
    ocratic       0.07
    rog           0.07
    s             0.06
    ment          0.06

    Activation Density: 0.074%

    No Known Activations