INDEX
Explanations
proper nouns or names
references to specific individuals or entities, particularly names associated with various contexts
Negative Logits
rers                 -0.80
cru                  -0.76
bre                  -0.76
rum                  -0.76
pole                 -0.72
quickShipAvailable   -0.71
shape                -0.70
wolves               -0.70
pine                 -0.69
rer                  -0.69
Positive Logits
osit                 0.94
wana                 0.87
oser                 0.83
osate                0.82
ylon                 0.80
oba                  0.78
nect                 0.74
ĺħ                   0.74
itton                0.73
ça                   0.73
Activations Density: 0.028%