Explanations
phrases that prompt critical thinking or consideration
Negative Logits
ecycle    -0.16
arness    -0.15
лÑĥ       -0.14
اÙĦØ´    -0.14
kker      -0.14
owan      -0.14
άνÏī      -0.14
Hacker    -0.13
hait      -0.13
ãĤ¸ãĤ¢    -0.13
Positive Logits
yourself  0.16
ance      0.15
593       0.14
hra       0.14
inge      0.14
567       0.14
672       0.14
htdocs    0.13
/Set      0.13
tout      0.13
Activations Density 0.092%