    Negative Logits
    Token            Value
    "ilation"        0.50
    " malo"          0.49
    "imaan"          0.46
    "ndon"           0.45
    "ilty"           0.44
    (not shown)      0.44
    "quisition"      0.44
    "varlak"         0.44
    " sợ"            0.43
    (not shown)      0.43
    Positive Logits
    Token            Value
    "transformers"   0.84
    " Transformers"  0.84
    " Hug"           0.81
    " transformers"  0.79
    "Transformers"   0.76
    " hugging"       0.74
    " hug"           0.72
    "Hughes"         0.71
    "🤗"             0.71
    " Face"          0.70
    Act Density 0.013%

    No Known Activations