INDEX
    Explanations

    words related to comparisons

    instances of the empty token or breaks in text flow

    Negative Logits
    Token          Logit
    "stood"        -0.68
    " Azerb"       -0.63
    "anamo"        -0.62
    "edIn"         -0.62
    " Seym"        -0.61
    " Clarkson"    -0.59
    "emale"        -0.58
    "egu"          -0.58
    " surn"        -0.58
    " Roe"         -0.58

    Positive Logits
    Token          Logit
    " ]"           0.81
    " ]["          0.78
    " ]."          0.74
    " )"           0.73
    " );"          0.68
    " ];"          0.66
    "Í"            0.65
    " )."          0.65
    " ):"          0.65
    " ><"          0.65
    Activation Density: 0.221%

    No Known Activations