Explanations
mentions or references to notable figures or political leaders
the word "at" in various contexts
Negative Logits
Token        Logit
Tsukuyomi    -0.75
SPONSORED    -0.72
ppelin       -0.69
vous         -0.65
¥ŀ           -0.64
stocks       -0.63
SER          -0.60
pter         -0.59
æĸ¹          -0.59
gravity      -0.59
Positive Logits
Token        Logit
abase        1.17
rix          1.15
rice         1.08
oday         1.08
mosp         1.01
hemat        1.00
rices        0.98
hens         0.97
rium         0.95
roph         0.95
Activations Density 0.038%