INDEX
Explanations
keywords related to weighing the advantages and disadvantages of a situation or decision
references to advantages and disadvantages in various contexts
NEGATIVE LOGITS
faiths        -0.72
gments        -0.70
Prophe        -0.69
Galile        -0.66
pige          -0.63
prisoners     -0.62
ĭ             -0.62
journeys      -0.61
Bird          -0.61
Sutherland    -0.61
POSITIVE LOGITS
creen         1.33
ervative      1.14
ervatives     1.13
heet          1.09
aic           1.03
uits          1.03
linger        0.96
igl           0.94
cale          0.94
peed          0.93
Activations Density: 0.199%