INDEX
    Explanations

    phrases related to personal qualities and characteristics

    statements related to emotional or psychological observations

    NEGATIVE LOGITS
    swick      -0.88
    asio       -0.70
    ESE        -0.69
    wright     -0.69
    afety      -0.69
    hers       -0.66
    SHIP       -0.64
    stanbul    -0.64
    bia        -0.62
    thora      -0.62
    POSITIVE LOGITS
     albeit        0.92
    gradient       0.85
     etc           0.76
     encomp        0.72
     economical    0.71
     huh           0.69
    sounding       0.69
    minded         0.68
    itar           0.64
     contro        0.64
    Activation Density: 0.289%

    No Known Activations