INDEX
Explanations
words related to physical hygiene and cleanliness
Negative Logits
ramer      -0.38
ertain     -0.37
rals       -0.36
yers       -0.35
livest     -0.34
raine      -0.32
lov        -0.32
elsen      -0.32
rn         -0.31
binding    -0.31
Positive Logits
flush      0.54
flushed    0.51
toilets    0.43
Meadows    0.38
cheeks     0.37
blush      0.35
nery       0.33
itant      0.33
ting       0.32
ishable    0.32
Activations Density: 9.868%