Explanations
phrases expressing degrees of approximation or near certainty
Negative Logits
ró        -0.18
orem      -0.15
uty       -0.15
lements   -0.15
/plain    -0.15
iÄĻ       -0.15
emmel     -0.14
athers    -0.14
.datas    -0.14
jug       -0.14
Positive Logits
entirely      0.17
exclusively   0.16
atical        0.15
arians        0.15
ny            0.15
ewan          0.14
avian         0.14
_NOTICE       0.14
æł·çļĦ        0.13
_almost       0.13
Activations Density: 0.024%