Explanations
occurrences of the word "the."
Negative Logits
Token            Logit
gün              -0.15
stadt            -0.14
igid             -0.14
éĤ¦              -0.14
_area            -0.14
hoff             -0.14
à¥įà¤Łà¤°        -0.14
area             -0.14
qualification    -0.13
rest             -0.13
Positive Logits
Token            Logit
expense          0.41
expense          0.31
Expense          0.30
mercy            0.26
helm             0.26
urging           0.25
end              0.25
request          0.24
discretion       0.24
Expense          0.23
Activations Density 0.109%