Explanations
statements relating to health risks and medical conditions
Negative Logits
ourke      -0.15
eya        -0.15
ez         -0.14
#echo      -0.14
anou       -0.14
嘛         -0.14
入り       -0.14
そして     -0.13
orque      -0.13
же         -0.13
Positive Logits
according    0.57
according    0.50
According    0.36
According    0.35
selon        0.34
según        0.33
ccording     0.28
accordance   0.27
根据         0.27
secondo      0.25
Activations Density: 0.058%