INDEX
Explanations
references to the city of Paris in documents
references to the Paris Agreement and its context
Negative Logits
Token     Logit
ownt      -0.80
ramid     -0.79
verbs     -0.72
ULT       -0.71
Redd      -0.71
served    -0.70
FTWARE    -0.67
bish      -0.66
OWS       -0.64
serving   -0.64
Positive Logits
Token     Logit
Paris     1.03
furt      1.03
Hilton    1.02
ienne     0.97
Paris     0.87
Mé        0.79
Marse     0.79
neau      0.77
etta      0.77
Hebdo     0.77
Activations Density: 0.010%