INDEX
    Explanations

    locations mentioned with a high level of activation

    instances of the word "at" used in various contexts

    Negative Logits
    Token         Logit
    "FTWARE"      -0.76
    "alpha"       -0.70
    "gravity"     -0.69
    "Russ"        -0.67
    "REDACTED"    -0.65
    "HTTP"        -0.63
    "istar"       -0.62
    "Lens"        -0.62
    " WRITE"      -0.61
    "PE"          -0.61
    Positive Logits
    Token         Logit
    " least"      1.27
    "onement"     0.98
    "abase"       0.94
    " halftime"   0.84
    "rial"        0.82
    "roph"        0.80
    " times"      0.72
    " liberty"    0.71
    "oned"        0.70
    "las"         0.70
    Activation Density: 0.242%

    No Known Activations