INDEX
    Explanations

    occurrences of the word "bath"

    Negative Logits
    " desmotivaciones"    -1.05
    "majánló"             -0.96
    "<unused43>"          -0.96
    " pinulongan"         -0.95
    "<unused41>"          -0.95
    "<pad>"               -0.94
    "<unused17>"          -0.94
    "<unused47>"          -0.94
    "<unused42>"          -0.94
    "<unused3>"           -0.94
    Positive Logits
    " Horn"               0.58
    " prints"             0.56
    (token not shown)     0.54
    " package"            0.53
    " download"           0.52
    " Ras"                0.52
    " picture"            0.50
    ","                   0.50
    (token not shown)     0.49
    "."                   0.48
    Activation Density: 0.211%

    No Known Activations