INDEX
    Explanations

    question-answer pairs and question-like sentence structures

    Negative Logits
    " how"          -0.91
    " How"          -0.74
    "howto"         -0.72
    "How"           -0.71
    "how"           -0.71
    " HOW"          -0.69
    "HOW"           -0.64
    " Bagaimana"    -0.62
    "  ("           -0.59
    " bagaimana"    -0.58
    Positive Logits
    " Does"         0.85
    "Does"          0.82
    " does"         0.74
    " Are"          0.74
    " Is"           0.73
    " Will"         0.72
    " Did"          0.68
    "Will"          0.66
    "Are"           0.66
    " are"          0.66
    Activation Density: 1.171%

    No Known Activations