    Explanations

    conjunctions and coordinating phrases that connect ideas

    Negative Logits
    才能
    -0.15
    ains
    -0.15
    кові
    -0.13
    ocu
    -0.13
    ICE
    -0.13
    andom
    -0.13
    ヌ
    -0.13
    ugins
    -0.13
    urus
    -0.13
    odef
    -0.13
    Positive Logits
     nor
    0.54
    nor
    0.43
     Nor
    0.40
    Nor
    0.35
     neither
    0.35
    也不
    0.28
     NOR
    0.25
     Neither
    0.23
     нор
    0.20
    Neither
    0.20
    Activation Density: 0.240%

    No Known Activations