    Negative Logits
    -0.07
    -0.07  "713"
    -0.06  "εκ"
    -0.06  "undred"
    -0.06  "	light"
    -0.06  "ávat"
    -0.06  "dere"
    -0.06
    -0.06  "->__"
    -0.06  "364"
    Positive Logits
    0.08  " cultivation"
    0.07  " spoilers"
    0.07  "_WATER"
    0.07  " chop"
    0.07  " Rope"
    0.07  " Nancy"
    0.06  " upholstery"
    0.06  ".Experimental"
    0.06  " Things"
    0.06  ".part"
    Act Density 0.298%

    No Known Activations