INDEX
    Explanations

    the word "steel" at different levels of relevance, with some activations indicating a very strong match

    Negative Logits
    " Kard"       -0.76
    " Niet"       -0.74
    " Garr"       -0.73
    "romeda"      -0.72
    " DOE"        -0.69
    " [|"         -0.67
    "itia"        -0.66
    " Chomsky"    -0.64
    "uate"        -0.64
    "annah"       -0.63
    Positive Logits
    "works"       1.06
    " wool"       1.05
    "Series"      1.03
    "workers"     1.00
    "worker"      0.96
    "steel"       0.94
    "anguage"     0.91
    "fish"        0.91
    " beams"      0.87
    "Steel"       0.87
    Act Density 0.020%

    No Known Activations