Explanations
references to specific locations or cities
punctuation marks and specific formatting elements in the text
Negative Logits
thood              -0.65
natureconservancy  -0.64
dn                 -0.63
itary              -0.61
tons               -0.61
cabinets           -0.61
machines           -0.60
units              -0.60
scope              -0.59
products           -0.59
Positive Logits
CITY               1.01
INC                0.93
MEN                0.87
ANE                0.87
GOODMAN            0.86
INC                0.85
IDER               0.85
ANK                0.83
VIDE               0.83
MEN                0.82
Activations Density: 0.059%