INDEX
Explanations
words related to superlatives or rankings
key attributes or features associated with rankings and evaluations
Negative Logits
waivers      -0.68
flats        -0.67
Jackets      -0.66
¶ħ           -0.64
adoption     -0.62
employment   -0.61
Knights      -0.60
thood        -0.60
poisoning    -0.60
tubes        -0.60
Positive Logits
ented        1.15
iable        1.07
ivable       1.07
entious      1.07
icable       1.06
isable       1.06
istent       1.05
ested        1.02
izable       1.02
urable       1.00
Activations Density: 0.220%