INDEX
Explanations
suffixes that typically denote nouns or processes
Negative Logits
Kamp          -0.75
recomm        -0.69
Masquerade    -0.68
indal         -0.64
Tuc           -0.64
ONSORED       -0.63
\\\\\\\\      -0.63
Haram         -0.63
Wonderland    -0.61
CHAR          -0.61
Positive Logits
etary         0.84
cious         0.81
incial        0.77
hesive        0.77
ents          0.74
erity         0.73
phrine        0.72
nces          0.72
onial         0.72
perate        0.71
Activations Density: 0.011%