    Explanations

    possibility

    The neuron fires on numeric measurement tokens (e.g. values with units or technical specs).

    Negative Logits
    Token        Logit
    "aku"        -0.07
    " easy"      -0.07
    "rack"       -0.06
    " grace"     -0.06
    " runner"    -0.06
    " arts"      -0.06
    "book"       -0.06
    " says"      -0.06
    " Thanks"    -0.06
    "-week"      -0.06
    Positive Logits
    Token              Logit
    " potentially"     0.09
    "anyl"             0.08
    " possibly"        0.08
                       0.07
    " hypothetical"    0.07
    "entlich"          0.07
    " baskı"           0.07
                       0.07
    "اص"               0.07
    "(($"              0.07
    Act Density 0.012%
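
    The page does not define how Act Density is computed. A minimal sketch, assuming it is the share of sampled tokens on which the neuron's activation is above zero, expressed as a percentage. The function name, the threshold parameter, and the sample data are hypothetical and only illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (active tokens / total tokens) * 100.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 3 active tokens out of 25,000 sampled tokens.
sample = [0.0] * 24_997 + [0.4, 1.2, 0.7]
density = activation_density(sample)
print(f"{density:.3f}%")
```

    Under this assumption, a density of 0.012% would correspond to roughly 12 active tokens per 100,000 sampled.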

    No Known Activations