INDEX
Explanations
adjectives that describe something new or recently made
the concept of being new or revitalized
Negative Logits
rael        -0.77
Donation    -0.70
idget       -0.68
oried       -0.66
ylum        -0.65
oris        -0.65
guided      -0.64
adian       -0.64
auga        -0.64
Ĥİ          -0.64
Positive Logits
ness        1.18
lish        0.91
foundland   0.81
fresh       0.77
lings       0.76
scratch     0.75
breeze      0.73
batch       0.72
bie         0.72
water       0.71
Activations Density: 0.025%