INDEX
    Explanations

    prepositions and location-related terms

    prepositions and expressions of relationships in text

    Negative Logits
    "hement"        -0.85
    "ategory"       -0.80
    "antine"        -0.79
    "hower"         -0.69
    "heastern"      -0.67
    "icularly"      -0.66
    " nodd"         -0.64
    "icut"          -0.64
    "nor"           -0.64
    "ering"         -0.63
    Positive Logits
    " Humanity"      0.94
    " Noise"         0.87
    " Hate"          0.83
    " Geek"          0.83
    " Represent"     0.80
    " Extrem"        0.78
    " Difference"    0.77
    " Computing"     0.76
    " Anarch"        0.75
    " Blind"         0.75
    Activation Density: 0.279%

    No Known Activations