INDEX
    Explanations

    phrases that describe perceptions and qualities of objects or experiences

    Negative Logits
    });*/           -0.75
    )");            -0.70
    }));            -0.70
     CanadaChoose   -0.66
    "){             -0.66
    ++){            -0.66
    )";             -0.66
    ")){            -0.65
    )*/             -0.65
    ("")]           -0.65
    Positive Logits
    looks           1.12
    Looks           1.09
     Looks          1.09
     looks          1.08
     appear         0.96
     APPE           0.94
     sounded        0.93
     appears        0.93
     sounding       0.90
     looked         0.88
    Activation Density: 0.174%

    No Known Activations