INDEX
    Explanations

    instances where there is a willingness to do something

    expressions of willingness or unwillingness to take action

    Negative Logits
    "icle"        -0.87
    "icles"       -0.76
    "inas"        -0.74
    "agraph"      -0.70
    "adish"       -0.70
    " Regions"    -0.69
    " Sections"   -0.67
    " Panther"    -0.66
    " Anthem"     -0.65
    " NCT"        -0.64
    Positive Logits
    " willingness"      1.05
    " unwillingness"    0.93
    "yip"               0.91
    " guiActiveUn"      0.90
    " attitude"         0.88
    "ï¸"                0.86
    "terday"            0.78
    " willingly"        0.76
    " stance"           0.75
    " reluctance"       0.74
    Activation Density: 0.006%

    No Known Activations