INDEX
Explanations
the infinitive form of verbs, particularly those indicating actions or decisions
Negative Logits
loff        -0.17
åĨĮ         -0.16
urator      -0.14
mons        -0.14
lation      -0.14
cken        -0.13
Opposition  -0.13
ocab        -0.13
tres        -0.13
Advisor     -0.13
Positive Logits
empo        0.16
Lives       0.16
lut         0.15
erner       0.15
ornings     0.15
565         0.15
hower       0.14
zzle        0.14
LD          0.14
_MPI        0.14
Activations Density: 0.013%