INDEX
Explanations
various types of categories or classifications
phrases indicating variety or multiple categories of items
Negative Logits
heid          -0.81
LAN           -0.74
edia          -0.71
adium         -0.70
Recomm        -0.68
hani          -0.67
Interstitial  -0.66
mented        -0.66
mary          -0.66
aeper         -0.66
Positive Logits
etter         0.92
hots          0.81
chool         0.79
paces         0.77
pace          0.76
etting        0.76
afe           0.73
ucker         0.70
cue           0.66
ãĥĺ           0.66
Activations Density: 0.016%