INDEX
    Explanations
    No Explanations Found
    Negative Logits
     Lorca
    -0.59
    expandindo
    -0.58
    wang
    -0.56
     étrangère
    -0.54
     eddy
    -0.54
     Wickham
    -0.54
     <<<<<<<<<<<<<<
    -0.53
    queryInterface
    -0.53
     Councils
    -0.52
     occidentale
    -0.52
    Positive Logits
    enderror
    0.60
    HtmlAttribute
    0.57
     transfieras
    0.51
    h
    0.46
    __);
    0.44
     Cas
    0.44
    hér
    0.44
     Kat
    0.44
     Henn
    0.43
     
    0.42
    Activation Density 0.000%

    No Known Activations