INDEX
Explanations
contrasting phrases that indicate a shift in perspective
Negative Logits
orgia     -0.16
aston     -0.14
rai       -0.14
csr       -0.14
gia       -0.14
eč        -0.14
-Token    -0.14
rig       -0.14
ustum     -0.14
/ajax     -0.14
Positive Logits
merely    0.18
ону       0.17
ala       0.17
meer      0.16
zen       0.16
バー      0.15
nor       0.15
ts        0.15
ones      0.15
ajar      0.15
Activations Density 0.031%
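
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed. It assumes that "density" means the fraction of sampled token positions on which the feature's activation is above zero. The page does not state its definition, so that assumption, the function name `activation_density`, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (positions where the feature fires) / (all positions).
    """
    if not activations:
        raise ValueError("activations must not be empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 31 of 100,000 token positions.
acts = [0.0] * 99_969 + [1.5] * 31
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.031%
```

With these made-up numbers the result matches the 0.031% shown above. The dashboard's real sample size is not given.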