Explanations
various combinations of letters
abbreviations or acronyms
Negative Logits
ÄŁ          -0.78
ct          -0.71
fal         -0.62
awatts      -0.62
roc         -0.62
hex         -0.62
iverpool    -0.62
cc          -0.60
bc          -0.60
ks          -0.60
Positive Logits
ERY         1.26
ICAL        1.24
ISH         1.22
IFIED       1.22
INS         1.20
ING         1.19
ERS         1.19
ITS         1.19
ITION       1.19
ICLE        1.17
Activations Density 0.091%