INDEX
    Explanations

    contractions of "it is" with high activations

    pronominal references to the possessive form "its."

    Negative Logits
    Token          Logit
    " Thib"        -0.80
    "rette"        -0.73
    "ãĤ¹ãĥĪ"       -0.68
    "eering"       -0.67
    "Trend"        -0.66
    "stad"         -0.65
    "rum"          -0.65
    "ij士"         -0.65
    "roups"        -0.64
    "ozy"          -0.64
    Positive Logits
    Token             Logit
    "ELF"             1.16
    " own"            1.08
    " predecessor"    0.89
    "elf"             0.87
    "self"            0.87
    " apparent"       0.83
    " predecessors"   0.82
    "sembly"          0.82
    " respective"     0.78
    "asca"            0.78
    Activation Density: 0.091%

    No Known Activations