INDEX
    Explanations

    references to locations or places

    repeated mentions of specific locations or places

    Negative Logits
     pedal       -0.67
     Ratio       -0.61
    Arab         -0.61
    Rod          -0.61
     decap       -0.60
     indoctr     -0.60
     list        -0.59
     determined  -0.59
     arming      -0.59
     lament      -0.59
    Positive Logits
    oa           3.80
    uu           1.80
    ua           1.33
    owa          1.21
    oji          1.08
     Gaga        1.00
    ui           0.97
    aho          0.95
    oj           0.94
    anta         0.94
    Act Density 0.007%

    No Known Activations