INDEX
    Explanations

    words or phrases related to uncovering or discovering information

    references to clues or hints related to mysteries or investigations

    Negative Logits
    Token         Logit
    "lav"         -0.74
    "rifice"      -0.74
    "rior"        -0.71
    "kus"         -0.69
    "sburgh"      -0.68
    "ategory"     -0.65
    " Turks"      -0.64
    "rik"         -0.64
    "rie"         -0.64
    "roc"         -0.63
    Positive Logits
    Token         Logit
    " clue"       1.12
    " hint"       0.95
    " clues"      0.90
    " glean"      0.78
    " hints"      0.72
    "hole"        0.71
    "hig"         0.69
    "wcs"         0.67
    " detector"   0.67
    "hooting"     0.66
    Activation Density: 0.022%

    No Known Activations