INDEX
    Explanations

    locations or directions mentioned in text

    occurrences of the word "from."

    Negative Logits
    "ratulations"   -0.82
    "rongh"         -0.74
    "faced"         -0.73
    "mask"          -0.72
    "leneck"        -0.71
    "isphere"       -0.70
    "few"           -0.69
    "important"     -0.67
    "ascript"       -0.67
    "certain"       -0.66

    Positive Logits
    " afar"         1.35
    " whence"       1.14
    " thence"       0.99
    " abroad"       0.90
    " scratch"      0.90
    " inside"       0.84
    " elsewhere"    0.81
    " anywhere"     0.79
    " within"       0.78
    " somewhere"    0.76

    Act Density 0.235%

    No Known Activations