INDEX
Explanations
proper nouns or names
prominent names and entities involved in notable events
Negative Logits
Token         Logit
thereof       -0.76
.</           -0.75
}.            -0.71
.).           -0.71
..."          -0.70
".[           -0.70
$.            -0.69
]."           -0.68
etc           -0.67
)."           -0.66
Positive Logits
Token         Logit
meanwhile     0.86
ccording      0.63
spokesman     0.61
spokeswoman   0.60
Lavrov        0.60
odore         0.59
resa          0.58
surprisingly  0.57
':            0.56
Lauder        0.55
Activations Density 2.794%