INDEX
    Explanations

    the word "dam" and words starting with "dam"

    Negative Logits
    Token           Logit
    " Mario"        -0.57
    "راقي"          -0.57
    "lecte"         -0.56
    "PACE"          -0.56
    " نوز"          -0.56
    "HomeAsUp"      -0.55
    " No"           -0.55
    " gea"          -0.55
    "aternary"      -0.54
    "ukkah"         -0.54

    Positive Logits
    Token           Logit
    " dams"         1.70
    " dam"          1.63
    " Dam"          1.59
    "Dam"           1.53
    "dam"           1.44
    " Dams"         1.40
    " DAM"          1.33
    "dams"          1.20
    "DAM"           1.13
    " Damon"        1.05
    Activation Density 0.005%

    No Known Activations