Explanations
adjectives describing extremes or intensities
adjectives indicating extremes or notable characteristics
Negative Logits
mun      -0.82
skirts   -0.74
è£ħ      -0.72
Þ        -0.71
teasp    -0.70
clair    -0.69
bara     -0.69
Lyn      -0.67
isha     -0.67
ADS      -0.66
Positive Logits
imaginable      1.11
thing           0.92
manifestation   0.91
predictor       0.81
conceivable     0.80
exponent        0.76
culprit         0.76
ever            0.75
hest            0.73
possible        0.73
Activations Density 0.097%
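
A rough sketch of how a density figure like the one above could be computed, assuming "activations density" means the fraction of sampled token positions on which the feature fires (activation above zero). The function name, the threshold parameter, and the sample data are hypothetical and chosen only for illustration; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of firing positions) / (total positions).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 97 of 100,000 token positions.
sample = [1.5] * 97 + [0.0] * 99_903
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.097%
```

Under this reading, a density of 0.097% would mean the feature is active on roughly one token in a thousand, which fits a feature tied to a narrow set of contexts.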