INDEX
    Explanations

    references to originality or original content

    references to original content or originality in various contexts

    Negative Logits
    ucket       -0.72
     Simulator  -0.70
    wal         -0.69
    ega         -0.67
    roph        -0.66
    rom         -0.66
    angs        -0.65
    rolet       -0.65
    walk        -0.64
    ower        -0.63

    Positive Logits
    ity         1.41
    ITY         1.10
    izations    0.95
    ities       0.86
    ité         0.85
    lly         0.85
    itiz        0.82
    itized      0.81
    iator       0.80
    smanship    0.79
    Act Density 0.020%

    No Known Activations