INDEX
    Explanations

    weights in kilograms

    quantitative measures of weight

    Negative Logits
    Token         Logit
    "gotten"      -0.74
    "bidden"      -0.70
    "protected"   -0.66
    "western"     -0.66
    "vice"        -0.66
    "Ire"         -0.66
    "href"        -0.65
    "structed"    -0.65
    "bes"         -0.62
    " Dee"        -0.62
    Positive Logits
    Token         Logit
    "kg"           1.10
    " kg"          1.01
    "atsu"         0.83
    "enger"        0.82
    " kilograms"   0.80
    "cup"          0.78
    " ammonia"     0.77
    "JV"           0.77
    "ingly"        0.76
    "vl"           0.75
    Activation Density: 0.005%

    No Known Activations