INDEX
Explanations
phrases expressing a contrast or negation
phrases indicating decreasing quantity or comparison
Negative Logits
hill      -0.67
hurd      -0.66
dexter    -0.66
cox       -0.65
encies    -0.63
Run       -0.63
arthed    -0.61
aston     -0.61
iencies   -0.60
ricks     -0.60
Positive Logits
anship      0.84
tainment    0.73
cially      0.71
Solitaire   0.70
ports       0.66
Levant      0.64
atically    0.64
Locations   0.61
Digest      0.61
ال          0.60
Activations Density 0.360%