Explanations
proper nouns related to brands, titles, or notable individuals
names and brands
Negative Logits
Datuak              -0.48
WriteLiteral        -0.42
<eos>               -0.37
cara                -0.35
────                -0.34
...,                -0.34
RuleContext         -0.34
subsection          -0.34
gynhyrchwyd         -0.34
salu                -0.33
Positive Logits
zijne               0.59
kasarigan           0.58
dezelve             0.54
횟                   0.53
zoude               0.53
tambi               0.53
MessageTagHelper    0.52
nemlig              0.50
hæng                0.50
zelve               0.50
Activations Density: 0.020%