INDEX
    Explanations

    terms related to weight and weight changes

    Negative Logits
    Token             Logit
    "✨:"              -0.99
    " nahilalakip"    -0.91
    "omiast"          -0.91
    " mitosis"        -0.88
    " hostels"        -0.87
    "expandindo"      -0.87
    " bambú"          -0.86
    "OGND"            -0.84
    " Mij"            -0.83
    " näin"           -0.83
    Positive Logits
    Token             Logit
    " weight"         1.70
    "weight"          1.67
    " Weight"         1.66
    " weights"        1.64
    "weights"         1.57
    " WEIGHT"         1.54
    "Weight"          1.45
    "WEIGHT"          1.41
    " Weights"        1.32
    " weighting"      1.19
    Act Density: 0.078%

    No Known Activations