INDEX
Explanations
words indicating uncertainty or hesitation
informal qualifiers expressing a degree of mediocrity or dissatisfaction
Negative Logits
worthiness   -1.00
lain         -0.94
enance       -0.82
iens         -0.79
heid         -0.76
worthy       -0.74
atars        -0.73
upon         -0.72
HCR          -0.71
imentary     -0.69
Positive Logits
darn         1.00
kinda        0.97
dunno        0.81
weird        0.79
shit         0.78
damn         0.78
silly        0.75
nect         0.75
bu           0.73
thing        0.73
Activations Density: 0.016%