INDEX
Explanations
words related to erasing or removing something
words related to the concept of error or mistakes
Negative Logits
xual      -0.70
Spears    -0.69
ership    -0.68
ertodd    -0.66
Premium   -0.65
Painter   -0.63
hyde      -0.63
pring     -0.62
SHIP      -0.62
hips      -0.62
Positive Logits
asure     1.22
asures    1.07
icit      1.02
aser      0.97
asing     0.90
rant      0.88
ogenous   0.88
bers      0.85
got       0.83
rors      0.82
Activations Density 0.009%