INDEX
    Explanations

    table of contents, weekly, translation, strategy

    Negative Logits
    Token             Logit
    " مساعد"          0.51
    "hoa"             0.46
    (not rendered)    0.45
    "shad"            0.45
    "Constit"         0.44
    " वैर"             0.43
    " 수는"           0.43
    "("               0.42
    "StringSet"       0.42
    "cules"           0.42
    Positive Logits
    Token             Logit
    (not rendered)    0.55
    (not rendered)    0.54
    (not rendered)    0.50
    "<0xB2>"          0.49
    (not rendered)    0.49
    "nél"             0.48
    "ешь"             0.48
    " wreck"          0.45
    " volontà"        0.44
    "at"              0.43
    Activation Density: 0.001%

    No Known Activations