INDEX
    Explanations

    emphasized references to significant weight or heaviness across various contexts

    Negative Logits
    Token            Logit
    "jour"           -0.17
    "ège"            -0.16
    "atatype"        -0.15
    "icina"          -0.14
    "osen"           -0.14
    "Ïħ"             -0.13
    "HeaderValue"    -0.13
    "_UTF"           -0.13
    "enberg"         -0.13
    "ï¸ı"            -0.13

    Positive Logits
    Token            Logit
    "-duty"          0.51
    " duty"          0.42
    " Duty"          0.39
    "weights"        0.39
    " hitters"       0.29
    "-weight"        0.29
    "weight"         0.29
    "-handed"        0.27
    " hitter"        0.26
    "wie"            0.25

    Activation Density: 0.019%

    No Known Activations