INDEX
Explanations
comparisons between different entities or concepts
phrases related to contrasting situations or viewpoints
Negative Logits
Windsor      -0.65
SPA          -0.61
highs        -0.61
doors        -0.59
WARD         -0.59
Waterloo     -0.57
Marketplace  -0.57
Astral       -0.57
foreigner    -0.57
zn           -0.56
Positive Logits
fared        0.89
thri         0.88
denies       0.87
udes         0.81
owes         0.80
withdrew     0.80
retains      0.79
suffers      0.79
reacted      0.79
accommod     0.78
Activations Density: 0.156%