Explanations
phrases indicating the present time context
Negative Logits
yc        -0.17
865       -0.16
leston    -0.15
_geom     -0.14
RIA       -0.14
illez     -0.14
cete      -0.14
ibbon     -0.14
änner     -0.14
/root     -0.14
Positive Logits
bo        0.16
wr        0.15
Cres      0.14
irre      0.14
ンフ      0.13
pective   0.13
IMUM      0.13
imon      0.13
imin      0.13
dire      0.13
Activations Density: 0.024%