INDEX
    Explanations

    terms related to scale and magnitude

    Negative Logits
    Token             Logit
    " Smithsonian"    -0.19
    " Sunder"         -0.17
    " Sweat"          -0.17
    "iful"            -0.16
    " Samantha"       -0.16
    "ugins"           -0.15
    " SVN"            -0.15
    " Sierra"         -0.15
    " Sidd"           -0.15
    " Sandy"          -0.15
    Positive Logits
    Token             Logit
    " scale"          0.77
    " Scale"          0.68
    "scale"           0.65
    " scales"         0.61
    "-scale"          0.61
    "Scale"           0.60
    "_scale"          0.57
    ".scale"          0.57
    " SCALE"          0.56
    "\tscale"         0.50
    Activation Density: 0.081%

    No Known Activations