INDEX
    Explanations

    words related to personal or shared experiences

    references to personal experiences

    Negative Logits
    " annex"      -0.71
    "vous"        -0.68
    "yright"      -0.67
    " nod"        -0.66
    "sub"         -0.65
    "law"         -0.64
    "corn"        -0.64
    "roup"        -0.63
    "inately"     -0.63
    "cut"         -0.62
    Positive Logits
    " experiences"    1.02
    " firsthand"      0.97
    " experience"     0.90
    " Experience"     0.89
    " experien"       0.85
    " Exper"          0.78
    "iences"          0.76
    " abroad"         0.72
    "Experience"      0.72
    "ional"           0.72
    Activation Density: 0.029%

    No Known Activations