Explanations
words related to staining, tarnishing, or damaging
words related to stains and their effects
Negative Logits
Token       Logit
plex        -0.89
WER         -0.75
udeau       -0.75
oglu        -0.72
ipeg        -0.70
otide       -0.69
ctrl        -0.68
Disorder    -0.68
ÄŁ          -0.68
izoph       -0.68
Positive Logits
Token       Logit
stains      1.04
tarn        1.01
stain       1.00
stained     0.94
coating     0.92
corrosion   0.89
tint        0.88
Beir        0.87
acidic      0.83
lipstick    0.82
Activations Density 0.075%