Explanations
conditional phrases related to events and actions
Negative Logits
are      -0.18
ought    -0.17
doit     -0.16
ieron    -0.16
στα      -0.16
is       -0.15
was      -0.15
ーティ   -0.15
ilio     -0.15
adol     -0.14
Positive Logits
fos      0.41
fos      0.32
fuera    0.29
haya     0.22
isse     0.22
uisse    0.22
foss     0.22
hub      0.22
asse     0.22
olsun    0.20
Activations Density: 0.020%