INDEX
Explanations
phrases related to personal preferences and tastes
Negative Logits
orges        -0.16
umba         -0.15
üzel         -0.14
ActionTypes  -0.14
छ            -0.14
labore       -0.14
ripp         -0.14
obil         -0.13
rous         -0.13
opyright     -0.13
Positive Logits
taste        0.74
tastes       0.69
Taste        0.56
tasted       0.44
preferences  0.42
tast         0.39
вку          0.39
preference   0.37
liking       0.33
preferences  0.33
Activations Density 0.085%