INDEX
Explanations
phrases indicating superlatives or rankings, particularly related to quality or performance
phrases indicating ranking or being among the best in various categories
Negative Logits
_-      -0.73
atari   -0.68
Items   -0.67
Query   -0.66
oult    -0.64
rod     -0.64
raid    -0.64
izoph   -0.62
ãĥĩ     -0.60
ALSE    -0.59
Positive Logits
existence   1.22
Europe      1.14
history     1.12
terms       1.08
America     1.08
town        1.05
world       1.03
Africa      0.90
humankind   0.90
Asia        0.89
Activations Density 0.072%
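
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is obtained: the share of sampled tokens on which the feature fires at all. This is an assumption about how the value was computed, not something the dashboard states. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled tokens with a
    nonzero activation; the dashboard itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: one active token out of four.
sample = [0.0, 0.0, 0.0, 1.5]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.072% would correspond to the feature firing on roughly 7 of every 10,000 sampled tokens.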