INDEX
Explanations
phrases indicating similarity or comparison
phrases that denote comparisons and similarities
Negative Logits
squash          -0.77
bang            -0.69
Beng            -0.64
Dota            -0.62
Bund            -0.61
ELY             -0.59
demolition      -0.59
neighbourhood   -0.58
Rumble          -0.58
Derby           -0.57
Positive Logits
chart           0.86
accompan        0.84
æ©Ł             0.82
ctr             0.78
wise            0.77
quartered       0.76
wcs             0.72
forward         0.70
initions        0.70
nown            0.70
Activations Density: 0.026%