Explanations
instances of the word "to" used for expressing intention or purpose
Negative Logits
wart     -0.16
reich    -0.15
agen     -0.15
夫       -0.14
atar     -0.14
enton    -0.14
jes      -0.14
iew      -0.14
adel     -0.14
ickle    -0.14
Positive Logits
because   0.21
because   0.17
karena    0.17
porque    0.16
anse      0.15
ERRU      0.15
omdat     0.15
缮を      0.15
потому    0.14
Because   0.14
Activations Density 0.245%