INDEX
    Explanations

    references to individuals in various contexts

    NEGATIVE LOGITS
    ValueStyle
    -0.89
    '},
    
    -0.81
    >");
    
    -0.81
    "},
    
    -0.81
    }`).
    -0.78
    '):
    
    -0.78
    ")){
    
    -0.77
    ]){
    
    -0.77
     />);
    -0.75
    >());
    -0.75
    POSITIVE LOGITS
     whom
    0.56
    whom
    0.54
    /*
    0.53
     kepada
    0.53
     calon
    0.53
     نزد
    0.53
    nocześnie
    0.52
     tegas
    0.50
     fellow
    0.49
    yscy
    0.49
    Activation Density: 0.227%

    No Known Activations