    Explanations

    references to cartoons

    Negative Logits
    Token             Logit
    "changes"         -0.80
    "rity"            -0.73
    "ivation"         -0.71
    "ulia"            -0.69
    " Availability"   -0.69
    " Priv"           -0.68
    "work"            -0.67
    "locks"           -0.66
    " Recovery"       -0.65
    "Skill"           -0.64
    Positive Logits
    Token             Logit
    " cartoon"        3.71
    " cartoons"       3.15
    " Cartoon"        2.29
    " caric"          2.05
    " caricature"     1.91
    " satir"          1.64
    " comic"          1.57
    " comics"         1.44
    " satirical"      1.42
    " animated"       1.32
    Activation Density: 0.021%

    No Known Activations