INDEX
Explanations
phrases that highlight emphasis or importance
Negative Logits
136      -0.15
ulas     -0.15
ract     -0.15
chet     -0.15
iska     -0.14
件       -0.14
adero    -0.14
idend    -0.14
glob     -0.14
zin      -0.14
POSITIVE LOGITS
importance   0.25
Importance   0.23
emphasis     0.23
phasis       0.22
phas         0.22
emphasis     0.20
uated        0.20
upon         0.18
emphasize    0.17
uating       0.17
Activations Density 0.032%