INDEX
Explanations
phrases that emphasize any variety of situations or elements
Negative Logits
quelques    -0.16
Some        -0.16
ÙĬÙĩ        -0.15
some        -0.15
ania        -0.15
ä¸ĢäºĽ      -0.15
reira       -0.14
еÑı         -0.14
ped         -0.14
rej         -0.14
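Several tokens in the list above (for example `ä¸ĢäºĽ` and `ÙĬÙĩ`) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn them back into text, assuming the tokenizer uses the GPT-2-style byte-to-character table. The page does not name the model or tokenizer, so that is an assumption, and `bytes_to_unicode` and `decode_token` are illustrative helper names rather than part of any dashboard API. Under that assumption, `ä¸ĢäºĽ` would decode to the Chinese 一些 ("some"), which matches the other negative-logit tokens (`quelques`, `Some`, `some`).

```python
def bytes_to_unicode() -> dict[int, str]:
    """Rebuild the GPT-2-style table mapping each byte to a printable character.

    Assumption: the tokens on this page were rendered with this table.
    """
    # Bytes that are already printable map to themselves:
    # '!'..'~' (33-126), U+00A1..U+00AC (161-172), U+00AE..U+00FF (174-255).
    printable = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in table:
            # Remaining bytes are shifted into the range starting at U+0100.
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: displayed character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed vocabulary string back to UTF-8 text."""
    raw = bytearray()
    for ch in token:
        if ch in _CHAR_TO_BYTE:
            raw.append(_CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table are kept as their own UTF-8 bytes.
            raw.extend(ch.encode("utf-8"))
    # Incomplete byte sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["quelques", "ä¸ĢäºĽ", "ÙĬÙĩ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Plain ASCII tokens pass through unchanged, so the helper can be applied to every row of the list. Tokens that are only a fragment of a multi-byte character will show a replacement character.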
Positive Logits
/all        0.33
ones        0.30
THING       0.30
place       0.28
sort        0.25
kind        0.24
one         0.23
thin        0.23
kind        0.22
ONE         0.22
Activations Density 0.099%