Explanations
terms related to low-quality or undesirable items
references to "junk" in various contexts
Negative Logits
Token    Logit
Æ        -0.80
cing     -0.79
oral     -0.71
ndum     -0.70
mberg    -0.70
gypt     -0.68
idency   -0.67
sis      -0.65
ignty    -0.65
utic     -0.64
Positive Logits
Token    Logit
heap     0.98
junk     0.97
drawer   0.89
pile     0.85
ies      0.80
busters  0.79
         0.78
unks     0.77
ãģı      0.77
etsu     0.77
Activations Density 0.010%