INDEX
    Explanations

    phrases descriptive of features and evaluations in reviews, especially related to video games

    Negative Logits
     ivi          -1.59
     emphat       -1.54
     accla        -1.49
     volunte      -1.43
     increa       -1.41
     embra        -1.41
     apprehen     -1.41
     philanth     -1.40
     reluct       -1.39
     Confe        -1.39
    Positive Logits
     regarding    0.62
    spli          0.61
     when         0.60
     towards      0.59
     toward       0.59
    CONFLICT      0.58
     in           0.58
     for          0.58
     among        0.57
     between      0.57
    Act Density 0.407%

    No Known Activations