    Explanations

    phrases or terms indicating a range or variety of subjects or topics

    phrases indicating a range or variety of topics or conditions

    Negative Logits
    Token         Logit
    "mit"         -0.79
    "imposed"     -0.75
    "spot"        -0.74
    "clusions"    -0.70
    "illin"       -0.70
    "ÄŁ"          -0.69
    "driving"     -0.69
    "rafted"      -0.67
    "template"    -0.67
    "yles"        -0.66
    Positive Logits
    Token         Logit
    " ranging"     0.91
    "nesota"       0.65
    " unlaw"       0.65
    " Luxem"       0.64
    "ãĤ¤ãĥĪ"        0.63
    "ranging"      0.62
    " vari"        0.62
    "ĸļ"           0.61
    "ogyn"         0.61
    " upwards"     0.61
    Activation Density: 0.020%

    No Known Activations