INDEX
    Explanations

    references to "fool" or "fools" in various contexts

    Negative Logits
    " onyx"      -0.52
    " Onyx"      -0.50
    " Nowak"     -0.50
    " Aria"      -0.48
    "asString"   -0.47
    " Karina"    -0.46
    " MIA"       -0.46
    " atx"       -0.46
    " Ona"       -0.45
    " Parkes"    -0.45

    Positive Logits
    " Fool"      2.03
    " fool"      1.93
    "Fool"       1.91
    "fool"       1.84
    " Fools"     1.74
    " fools"     1.66
    " fooling"   1.14
    " fooled"    1.00
    " foolish"   0.92
    "Fo"         0.89

    Act Density 0.003%

    No Known Activations