INDEX
    Explanations
    Head Attr Weights
    0:0.07
    1:0.38
    2:0.03
    3:0.02
    4:0.03
    5:0.21
    6:0.04
    7:0.01
    8:0.04
    9:0.06
    10:0.03
    11:0.02
    Negative Logits
    Enlarge:-2.44
     rye:-1.68
     Freeze:-1.66
    riers:-1.63
    toggle:-1.63
     UNESCO:-1.61
    ICS:-1.61
    ************:-1.61
     Fritz:-1.60
     pengu:-1.59
    Positive Logits
    am:2.82
    AM:2.81
    amer:2.54
    ams:2.50
    amate:2.37
    amac:2.29
    amn:2.25
    aml:2.24
    amic:2.17
    ammy:2.13
    Act Density 0.011%

    No Known Activations