Explanations
instances of the word "app" and related forms
Negative Logits
äge        -0.20
adera      -0.18
eping      -0.18
ep         -0.15
cker       -0.15
esters     -0.15
idelity    -0.15
esti       -0.15
ctor       -0.14
etter      -0.14
Positive Logits
alach      0.26
licable    0.26
rais       0.25
ropriate   0.24
lic        0.23
raised     0.23
ellation   0.23
ointed     0.22
ended      0.22
arent      0.21
Activations Density: 0.017%