INDEX
    Explanations

    references to change and the contrasting nature of circumstances over time

    Negative Logits
    " geprek"           -0.60
    " autorytatywna"    -0.57
    "ReusableCell"      -0.56
    "ondern"            -0.49
    " nonUne"           -0.48
    "idency"            -0.47
    " penghargaan"      -0.47
    " TextInputType"    -0.47
    "Dichloroprop"      -0.46
    "roja"              -0.46
    Positive Logits
    " things"            0.81
    " Everything"        0.76
    "things"             0.73
    " everything"        0.73
    " Things"            0.71
    "everything"         0.71
    "Everything"         0.69
    "Things"             0.67
    " THINGS"            0.66
    "THINGS"             0.57
    Activation Density: 0.191%

    No Known Activations