INDEX
    Explanations

    words expressing strong emotions or preferences towards certain actions or entities

    expressions of affection and aversion

    Negative Logits
    Token                   Logit
    "phabet"                -0.81
    "EStreamFrame"          -0.77
    "rome"                  -0.75
    "externalActionCode"    -0.72
    "harm"                  -0.69
    "soType"                -0.66
    "nw"                    -0.65
    "level"                 -0.65
    "levels"                -0.65
    "aqu"                   -0.64

    Positive Logits
    Token                   Logit
    " themselves"            0.67
    " Pigs"                  0.61
    " revenge"               0.59
    " foreigners"            0.59
    " passionately"          0.59
    " sticking"              0.58
    " eagerly"               0.58
    " outsiders"             0.58
    " importing"             0.56
    " advertising"           0.56

    Activation Density: 0.319%

    No Known Activations