    Explanations

    references to different weights or weight-related terms

    references to weight in various contexts

    Negative Logits
    Token           Logit
    "ICLE"          -0.84
    "BLE"           -0.82
    " VIS"          -0.73
    " Noir"         -0.73
    " Mb"           -0.73
    " Suc"          -0.72
    "xon"           -0.70
    " Kidd"         -0.69
    "UTE"           -0.69
    " Gutenberg"    -0.68
    Positive Logits
    Token           Logit
    " weights"      1.05
    "lifting"       1.03
    "weight"        0.98
    " weight"       0.96
    " heaviest"     0.93
    "weights"       0.92
    " heavier"      0.89
    " burdens"      0.89
    "iless"         0.87
    " lifting"      0.86
    Act Density 0.021%

    No Known Activations