INDEX
Explanations
phrases indicating the presence of various subjects or themes in a context
Negative Logits
cks     -0.17
aldi    -0.16
le      -0.16
lek     -0.15
ÂŃt     -0.15
Jose    -0.14
ve      -0.14
zar     -0.14
José    -0.13
346     -0.13
Positive Logits
uch             0.18
áž              0.17
legen           0.17
DMIN            0.17
cus             0.16
gen             0.16
asma            0.16
ItemSelected    0.16
ustin           0.16
-toggler        0.16
Activations Density: 0.036%