INDEX
    Explanations

    kitchen contexts and languages

    Negative Logits
    " sadistic"      0.49
    " dermat"        0.43
    " barbar"        0.43
    " distrust"      0.43
    " perpetrated"   0.42
    " virulent"      0.42
    "🔞"             0.42
    " algorith"      0.42
    " まあ"          0.41
    " mistrust"      0.40
    Positive Logits
    " kitchen"       1.20
    " Kitchen"       1.11
    "Kitchen"        1.10
    "kitchen"        1.09
    " keuken"        0.98
                     0.96
    " cozinha"       0.96
    " kitchens"      0.95
    "キッチン"       0.95
    " кух"           0.95
    Act Density 0.038%

    No Known Activations