    Explanations

    phrases that assert superiority in various contexts

    Negative Logits
    Token       Logit
    " Rossa"    -0.67
    "RTGC"      -0.61
    " Hoh"      -0.61
    "twimg"     -0.58
    " Ski"      -0.57
    " Birch"    -0.55
    " Bré"      -0.55
    " ski"      -0.55
    " plak"     -0.55
    " Gör"      -0.54
    Positive Logits
    Token               Logit
    "ftagPool"          0.79
    "PositiveButton"    0.74
    "NegativeButton"    0.73
    "}`;"               0.72
    " referrerpolicy"   0.71
    ")');"              0.69
    "onViewCreated"     0.68
    " traveler"         0.67
    "elesaikan"         0.67
    "}';"               0.66
    Activation Density: 0.065%

    No Known Activations