INDEX
Explanations
phrases related to gaining or having control over a situation or entity
Negative Logits
o        -0.16
olph     -0.14
elli     -0.14
ussian   -0.14
ERA      -0.14
gums     -0.14
iste     -0.14
rada     -0.14
ē        -0.13
eing     -0.13
Positive Logits
аниц     0.20
cosa     0.15
igram    0.15
urge     0.15
ζη       0.15
ighton   0.14
ěj       0.14
řen      0.14
еж       0.14
ERGE     0.14
Activations Density 0.021%