INDEX
Explanations
specific single-letter and multi-letter abbreviations or initials
Negative Logits
RELE        -0.71
é¾įå¥ij士    -0.71
Dire        -0.70
UP          -0.67
MH          -0.67
RELEASE     -0.66
ãĤ°         -0.63
FORM        -0.63
Sigma       -0.63
Bom         -0.62
Positive Logits
immer       1.21
abs         1.11
orks        1.10
arma        1.05
ork         1.03
oker        1.02
arg         1.00
ogg         0.99
aps         0.98
ospital     0.98
Activations Density: 0.104%