    Explanations

    words or phrases related to explanations, particularly using 'i.e.' or 'e.g.' as a signal

    occurrences of the letter 'e'

    Negative Logits
    theless      -0.68
    ONSORED      -0.59
     tomat       -0.53
     ages        -0.52
     blows       -0.52
     Clicker     -0.51
     solder      -0.50
     centre      -0.50
     blat        -0.49
     intimid     -0.49
    Positive Logits
    .,       1.96
    .:       1.43
    .;       1.39
    .?       1.21
    .).      1.17
    .,"      1.16
    .),      1.12
    .):      1.00
    .-       0.92
    .—       0.91
    Activation Density: 0.020%

    No Known Activations