INDEX
Explanations
questions about various topics
questions related to societal issues and personal beliefs
Negative Logits
.}        -0.74
"},       -0.69
().       -0.68
Firstly   -0.66
©¶æ¥µ     -0.66
.","      -0.65
$.        -0.65
.;        -0.65
();       -0.64
}.        -0.64
Positive Logits
?:        1.46
?         1.44
?",       1.43
?"        1.39
?'        1.38
?".       1.37
?).       1.32
?),       1.30
...?      1.30
?'"       1.29
Activations Density 0.429%