INDEX
    Explanations

    references to "thing" in various contexts and its implications

    Negative Logits
    Token         Logit
    isticated     -0.82
     ***/         -0.75
    (whitespace)  -0.72
    nexpected     -0.69
    اولة          -0.68
    letal         -0.68
    ']]           -0.68
     Schatten     -0.67
     "}           -0.67
    ."</          -0.67
    Positive Logits
    Token         Logit
     thing        2.05
     THING        1.92
     Thing        1.84
    Thing         1.65
    thing         1.44
    THING         1.42
     thingy       1.23
     coisa        1.13
     cosa         0.99
     thang        0.82
    Activation Density: 0.054%

    No Known Activations