INDEX
Explanations
strings of letters or abbreviations in a specific format
symbolic references to entities or institutions associated with education
Negative Logits
683         -0.80
ococ        -0.75
685         -0.73
asbestos    -0.73
690         -0.70
1870        -0.69
hots        -0.69
684         -0.67
Scarlett    -0.67
679         -0.66
Positive Logits
D           1.33
DD          1.29
d           1.26
DP          1.26
ds          1.24
dt          1.22
DS          1.22
Ds          1.21
DR          1.21
dor         1.20
Activations Density: 0.728%