INDEX
Explanations
words or phrases related to importance or necessity
terms that denote significance or essentiality in various contexts
Negative Logits
Token       Logit
brids       -0.85
ULTS        -0.84
ravings     -0.83
bows        -0.78
riots       -0.78
books       -0.76
icons       -0.76
alks        -0.75
oops        -0.74
illions     -0.74
Positive Logits
Token        Logit
component    1.42
element      1.39
part         1.37
aspect       1.33
factor       1.29
ingredient   1.20
contributor  1.16
facet        1.15
feature      1.11
indicator    1.09
Activations Density: 0.142%