INDEX
Explanations
references to having systems, policies, or guidelines in place
references to established policies or systems
Negative Logits
Token       Logit
initely     -0.83
jin         -0.76
yip         -0.74
cest        -0.73
incinn      -0.73
uddy        -0.70
zzy         -0.69
yssey       -0.69
ingly       -0.67
anche       -0.67
Positive Logits
Token       Logit
bos         0.90
antioxid    0.79
holders     0.76
ascript     0.71
Ĥİ          0.71
defences    0.69
holder      0.68
lie         0.66
å§«         0.64
,,,,,,,,    0.63
Activations Density 0.018%