INDEX
Explanations
instances of reported speech or quotations
Negative Logits
ainen    -0.18
åĪ»      -0.16
uters    -0.15
licit    -0.15
alam     -0.15
weets    -0.14
utter    -0.14
mili     -0.14
ocht     -0.14
hib      -0.14
Positive Logits
igham    0.16
arella   0.15
rawer    0.15
kker     0.14
ULL      0.14
ieg      0.14
midt     0.14
strand   0.14
avr      0.14
YPE      0.14
Activations Density 0.050%