INDEX
    Explanations

    interview-related keywords like "spoke", "talked", and "phone"

    instances of the word "spoke" or related terms indicating conversation

    Negative Logits
    Token            Logit
    "————"           -0.74
    " ILCS"          -0.71
    "anny"           -0.69
    "helps"          -0.66
    "aters"          -0.64
    "zyk"            -0.63
    "~~~~"           -0.58
    "inner"          -0.57
    "ritional"       -0.57
    "atever"         -0.56
    Positive Logits
    Token            Logit
    " extensively"   0.85
    " briefly"       0.78
    " charism"       0.72
    " about"         0.72
    " glow"          0.70
    "afety"          0.69
    "ebus"           0.67
    "vice"           0.66
    " separately"    0.64
    " volumes"       0.64
    Activation Density: 0.054%

    No Known Activations