INDEX
Explanations
phrases indicating dependency or reliance on a service or resource
Negative Logits
inez        -0.18
angelo      -0.17
izontally   -0.16
stral       -0.16
lement      -0.16
dea         -0.15
orp         -0.15
óż          -0.15
itez        -0.15
vida        -0.14
Positive Logits
upon        0.37
Upon        0.31
Upon        0.31
heavily     0.30
heav        0.29
upon        0.28
heavy       0.27
heavier     0.26
Heavy       0.24
Heavy       0.23
Activations Density 0.025%