INDEX
    Explanations

    References to the word "reveal"; the feature appears to detect instances where something is uncovered or disclosed.

    Negative Logits
    Token               Logit
    " дописавши"        -0.85
    " незавершена"      -0.70
    "IndentedString"    -0.69
    "NameInMap"         -0.68
    " discovered"       -0.67
    " EconPapers"       -0.66
    " Discovered"       -0.66
    "RenderAtEndOf"     -0.66
    " Announced"        -0.64
    " announced"        -0.63
    Positive Logits
    Token               Logit
    " reveal"           1.59
    "reveal"            1.53
    " Reveal"           1.39
    "Reveal"            1.31
    " revealing"        1.13
    " reveals"          0.91
    "reve"              0.76
    " revelar"          0.68
    " Reveals"          0.67
    " révé"             0.65
    Activation Density: 0.004%

    No Known Activations