INDEX
Explanations
adjectives indicating quality or evaluation
Negative Logits
favors          -0.23
Favorite        -0.22
neighborhood    -0.22
Organizer       -0.21
Favorite        -0.21
neighborhoods   -0.21
neighbors       -0.21
favorite        -0.21
favorable       -0.21
unfavor         -0.20
Positive Logits
cracking        0.25
adverts         0.23
£               0.23
programme       0.23
postcode        0.22
flavours        0.22
whilst          0.22
organisations   0.22
personalised    0.22
recognised      0.22
Activations Density 0.292%