INDEX
Explanations
words indicating actions and states of being
Negative Logits
akin
-0.16
міні
-0.15
/***/
-0.15
recated
-0.14
StringRef
-0.14
exas
-0.14
eling
-0.14
uvwxyz
-0.14
iq
-0.14
bane
-0.14
Positive Logits
0.17
grad
0.16
387
0.15
Ki
0.15
braz
0.15
zin
0.15
Sher
0.15
ram
0.14
rms
0.14
yst
0.14
Activations Density 0.023%