INDEX
Explanations
references to hills or mountainous landscapes
Negative Logits
kla      -0.17
s        -0.16
yne      -0.16
sak      -0.16
obra     -0.16
samp     -0.15
eos      -0.15
ooke     -0.14
olics    -0.14
zure     -0.14
Positive Logits
side     0.52
iard     0.41
top      0.40
ides     0.32
crest    0.31
arious   0.27
ier      0.27
-top     0.22
endale   0.21
borough  0.21
Activations Density 0.010%