INDEX
Explanations
proper names or terms related to different countries and individuals
linguistic characters or symbols, particularly unusual or diacritically marked letters
Negative Logits
IPS          -0.70
imer         -0.70
icles        -0.68
icle         -0.66
andel        -0.65
asket        -0.63
ochemical    -0.63
=-=-=-=-     -0.62
iffs         -0.62
quart        -0.62
Positive Logits
cus          0.85
ternity      0.80
lisher       0.75
ð            0.73
mented       0.72
lette        0.72
cci          0.69
lio          0.68
pees         0.68
enced        0.67
Activations Density 0.008%