INDEX
    Explanations

    words related to favor or preference

    Negative Logits
    Token            Logit
    " Schmitz"       -0.48
    " Bernadette"    -0.47
    " Schreiber"     -0.45
    " Jules"         -0.45
    "Christophe"     -0.44
    " PC"            -0.43
    " Kidd"          -0.43
    "¡¡"             -0.43
    " nhật"          -0.43
    " Carla"         -0.42
    Positive Logits
    Token            Logit
    " favor"         0.97
    " favour"        0.94
    " Favor"         0.93
    " Fav"           0.93
    "favor"          0.89
    "Favor"          0.88
    " favoring"      0.88
    "Fav"            0.86
    " favored"       0.85
    " FAVOR"         0.85
    Activation Density: 0.012%

    No Known Activations