    Explanations

    references to reading or content recommendations

    Head Attr Weights
    0:0.02
    1:0.03
    2:0.04
    3:0.09
    4:0.08
    5:0.04
    6:0.20
    7:0.04
    8:0.03
    9:0.06
    10:0.08
    11:0.25
    Negative Logits
    " grains"    -1.91
    " scrolls"   -1.84
    " Journals"  -1.80
    " oats"      -1.79
    " corros"    -1.78
    " mathemat"  -1.62
    " biotech"   -1.59
    "Condition"  -1.56
    " looting"   -1.53
    " cereal"    -1.53
    Positive Logits
    " Blanc"     1.62
    " Deer"      1.62
    "vae"        1.57
    " Pole"      1.57
    " Luna"      1.56
    " Osaka"     1.54
    " Torres"    1.54
    "ğ"          1.52
    " Yoshi"     1.50
    "ois"        1.50
    Act Density 0.002%

    No Known Activations