INDEX
Explanations
phrases related to completion or achievement
the word "is" in various contexts
Negative Logits
Siber      -0.57
demos      -0.56
itiner     -0.56
pots       -0.56
Pharaoh    -0.56
masses     -0.54
pract      -0.54
apixel     -0.54
Rebell     -0.53
coff       -0.53
Positive Logits
nown          0.92
ources        0.84
̶             0.83
ustainable    0.82
por           0.81
adi           0.79
lightly       0.79
âĹ            0.78
oda           0.76
-->           0.75
Activations Density: 0.080%