    Negative Logits
    phrine      -0.58
    granate     -0.49
    AllowUser   -0.45
    \"");       -0.44
    miert       -0.42
    ittens      -0.41
    cency       -0.41
     Seguro     -0.41
    fandom      -0.41
     henne      -0.41
    Positive Logits
     golf       1.72
    golf        1.63
    Golf        1.62
     Golf       1.62
     golfing    1.58
     GOLF       1.53
     golfer     1.52
     golfers    1.43
    GOLF        1.40
     PGA        1.09
    Activation Density: 0.095%

    No Known Activations