INDEX
Explanations
references to the concept of "Palestine" and related terms
Negative Logits
ylko        -0.16
ingly       -0.16
nat         -0.15
arov        -0.15
outgoing    -0.15
ÙĬØ©        -0.15
rous        -0.15
£           -0.15
naments     -0.14
Burl        -0.14
Positive Logits
stinian     0.29
ontology    0.28
olithic     0.27
ont         0.26
stin        0.25
ozo         0.19
Ale         0.19
hound       0.18
onto        0.18
ONT         0.17
Activations Density 0.006%
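
The density figure above is not defined on this page. A common reading, assumed here, is the fraction of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 6 of 100,000 token positions.
sample = [0.0] * 99_994 + [1.3, 0.7, 2.1, 0.4, 0.9, 3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.006%
```

Under this reading, a density of 0.006% corresponds to roughly 6 active positions per 100,000 tokens, so the feature is very sparse.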