INDEX
Explanations
proper nouns related to locations, organizations, and individuals
references to specific individuals and terms associated with them
Negative Logits
Gi          -0.70
wrapper     -0.64
garg        -0.64
Gi          -0.63
ãĤĬ         -0.63
¶ħ          -0.62
Watts       -0.62
Tab         -0.61
filtered    -0.60
Ky          -0.60
Positive Logits
endon       4.22
oxic        1.57
iens        1.54
endez       1.37
elong       1.28
igue        1.13
oxicity     1.12
vae         1.07
ibia        1.05
oxin        1.00
Activations Density 0.040%