    Explanations

    descriptive adjectives followed by nouns

    Negative Logits
    Token              Logit
    "কেই"              0.50
    (not shown)        0.37
    " cosy"            0.36
    (not shown)        0.36
    (not shown)        0.35
    " belieb"          0.35
    "\}$."             0.34
    "but"              0.34
    "ড়া"               0.34
    " благоприят"      0.34

    Positive Logits
    Token              Logit
    " እንዲሁም"          0.46
    " എന്നിവ"          0.45
    " (!)"             0.37
    " কাজটি"           0.35
    "hearted"          0.35
    "といった"          0.35
    " pornography"     0.34
    " whatnot"         0.34
    (whitespace)       0.33
    " ഒന്ന്"            0.33

    Activation Density: 0.046%

    No Known Activations