    Explanations

    words associated with gain or loss in competitive scenarios

    Negative Logits
    "]);     -0.94
    '],      -0.91
    "];      -0.87
    ")){     -0.85
    "],      -0.84
    ']);     -0.84
    ]));     -0.84
    '))      -0.84
    '));     -0.83
    ")));    -0.83
    Positive Logits
    PostConstruct       0.73
     —                  0.68
    wakili              0.61
                        0.59
                        0.58
     nahilalakip        0.56
    harusnya            0.54
    PickerController    0.54
     včetně             0.53
     stalo              0.52
    Act Density 0.503%

    No Known Activations