INDEX
    Explanations

    numerical values preceded by the phrase "up to"

    phrases related to limits or maximums

    Negative Logits
    Token         Logit
    "Dro"         -0.68
    "Reviewer"    -0.63
    "åº"          -0.62
    "News"        -0.61
    "Posted"      -0.61
    " doesnt"     -0.60
    "Tools"       -0.60
    "Moving"      -0.60
    "Leave"       -0.59
    "Still"       -0.59
    Positive Logits
    Token         Logit
    " 150"        1.05
    " 200"        1.01
    " 300"        0.97
    " 100"        0.97
    " 500"        0.95
    " 400"        0.95
    " 80"         0.94
    " 3000"       0.94
    " 1500"       0.94
    " 50"         0.94
    Act Density: 0.043%

    No Known Activations