INDEX
    Explanations

    answering complex subjective questions

    Negative Logits
    " nếu"  0.81
    " यदि"  0.80
    " اگر"  0.80
    " if"  0.78
    "いましたが"  0.78
    "였습니다"  0.77
    "τοι"  0.77
    "っていました"  0.76
    "إذا"  0.74
    "その後"  0.73
    Positive Logits
    " philosophers"  0.80
    " everyone"  0.78
    " Humanities"  0.77
    " humanities"  0.76
    " emocional"  0.75
    " nessuna"  0.74
    " politicians"  0.74
    " filóso"  0.73
    " nessun"  0.73
    "Nobody"  0.73
    Activation Density: 0.208%

    No Known Activations