INDEX
    Explanations

    the abbreviation "OT" at increasing levels of intensity, peaking on "OT" at 10

    instances of the token "OT", indicating a focus on out-of-target content or specific categories in a dataset

    NEGATIVE LOGITS
    Token       Logit
    "bler"      -0.72
    "uckland"   -0.68
    " fixme"    -0.68
    "ãĥ¼ãĥĨ"    -0.67
    "plain"     -0.62
    " mM"       -0.62
    "fold"      -0.61
    " nature"   -0.61
    "antha"     -0.61
    " versa"    -0.60

    POSITIVE LOGITS
    Token       Logit
    "assium"    1.17
    "TL"        1.02
    "OGR"       1.00
    "ECH"       0.97
    "atoes"     0.89
    "ION"       0.88
    "TE"        0.88
    "TO"        0.87
    "OT"        0.86
    "YP"        0.85

    Act Density 0.015%
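
A minimal sketch of how an activation density figure like the one above could be computed, assuming density means the fraction of sampled tokens on which the feature activates above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires; the dashboard's exact definition is not given here.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 3 of 20,000 tokens.
sample = [0.0] * 19997 + [2.5, 7.1, 10.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.015%
```

A density this low means the feature is silent on nearly all tokens, so a small sample of text may contain no activating examples at all.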

    No Known Activations