INDEX
Explanations
punctuation marks and certain sentence endings that signify either emphasis or concluding thoughts
Negative Logits
oloj     -0.17
Æł       -0.15
važ      -0.15
UTTON    -0.14
mpar     -0.13
anga     -0.13
ANTE     -0.13
ante     -0.13
IVO      -0.13
ift      -0.12
Positive Logits
Anyway     0.18
Anyway     0.17
Loc        0.14
ëĺIJ       0.14
ãģ¾ãģŁ     0.14
ählen      0.14
nu         0.14
loc        0.14
ayrıca     0.14
theless    0.14
Activations Density 0.535%
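
The density figure above can be read as the share of sampled token positions on which the feature fires. The sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample values are hypothetical and are not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature is active; the zero threshold is illustrative.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1000 token positions, 5 of them with a nonzero activation.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
print(f"{activation_density(sample):.3%}")  # prints 0.500%
```

Under this reading, a density of 0.535% would mean the feature is active on roughly 5 of every 1000 sampled tokens.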