INDEX
    Explanations

    words related to large quantities or intensities

    instances of the word "output" and its variations

    Negative Logits
    Token            Logit
    "tarian"         -0.68
    " Clause"        -0.65
    " Sharif"        -0.65
    " roy"           -0.64
    "è£ħ"            -0.63
    " viol"          -0.62
    "tone"           -0.62
    "stan"           -0.62
    " clauses"       -0.60
    " statically"    -0.60
    Positive Logits
    Token            Logit
    "ouring"         1.45
    "acing"          1.11
    "oring"          1.03
    "ours"           1.02
    "ored"           1.02
    "ifts"           1.00
    "acement"        0.99
    "acements"       0.96
    "oured"          0.96
    "orer"           0.95
    Activation Density: 0.051%

    No Known Activations