INDEX
    Explanations

    questions or prompts

    questions that prompt reflection or inquiry about various topics

    Negative Logits
     fullest    -0.78
    éĹĺ         -0.77
    ufact       -0.76
    fulness     -0.71
    ãĥ¼ãĥ«      -0.69
    wagen       -0.69
     hearts     -0.67
    fitting     -0.67
    weights     -0.67
     fuels      -0.64
    Positive Logits
    onga        0.98
    atar        0.91
    addafi      0.88
    aho         0.87
    omi         0.86
     Expand     0.85
    WER         0.85
    agan        0.81
    ihu         0.79
    iao         0.79
    Act Density 0.013%

    No Known Activations