INDEX
    Explanations

    words related to labeling or applying labels

    Negative Logits
    Token          Logit
    "issance"      -0.72
    "ctica"        -0.70
    "skill"        -0.69
    "aldo"         -0.66
    "rouch"        -0.65
    "vati"         -0.64
    " Pradesh"     -0.62
    "atana"        -0.61
    "ashington"    -0.60
    "isites"       -0.60
    Positive Logits
    Token          Logit
    " label"       0.87
    " labels"      0.86
    "mates"        0.83
    "ovan"         0.80
    "strip"        0.80
    "cloth"        0.79
    "printed"      0.77
    "mate"         0.76
    " Label"       0.76
    "mark"         0.74
    Activation Density: 0.048%

    No Known Activations