INDEX
Explanations
positive adjectives describing quality or appearance
Negative Logits
getRule          -0.66
DebuggerNonUser  -0.59
drept            -0.53
hingga           -0.52
Зноскі           -0.52
depoz            -0.52
ようになります   -0.50
eraard           -0.49
bodem            -0.49
Pratique         -0.49
Positive Logits
ties             0.85
nice             0.82
neat             0.78
touches          0.74
smelling         0.73
touch            0.72
NICE             0.72
little           0.72
TOUCH            0.72
tidy             0.70
Activations Density: 0.057%