INDEX
Explanations
different kinds of items or categories
references to various categories or types of entities
Negative Logits
Token      Logit
yer        -0.66
heid       -0.65
bay        -0.65
Yel        -0.64
edia       -0.64
phot       -0.63
inal       -0.63
LAN        -0.62
ITNESS     -0.62
ouston     -0.60
Positive Logits
Token      Logit
etter       1.00
etting      0.97
kinds       0.87
sorts       0.80
pace        0.76
heartedly   0.73
coerc       0.71
ername      0.70
omething    0.70
ometimes    0.70
Activations Density: 0.008%