    Explanations

    questions about various topics

    questions related to societal issues and personal beliefs

    Negative Logits
    Token     Logit
    .}        -0.74
    "},       -0.69
     ().      -0.68
    Firstly   -0.66
    ©¶æ¥µ     -0.66
    .","      -0.65
    $.        -0.65
    .;        -0.65
     ();      -0.64
    }.        -0.64
    Positive Logits
    Token     Logit
    ?:        1.46
    ?         1.44
    ?",       1.43
    ?"        1.39
    ?'        1.38
    ?".       1.37
    ?).       1.32
    ?),       1.30
    ...?      1.30
    ?'"       1.29
    Activation Density: 0.429%

    No Known Activations