Explanations
phrases indicating a possibility or likelihood
Negative Logits
Token       Logit
culosis     -0.73
anwhile     -0.70
rones       -0.68
ulously     -0.67
shall       -0.66
Mund        -0.64
ãĥ¼ãĥ«      -0.63
enaries     -0.61
inav        -0.61
guarantee   -0.60
Positive Logits
Token       Logit
able        1.09
tempted     1.05
forgiven    0.94
considered  0.94
swayed      0.91
construed   0.90
mistaken    0.88
regarded    0.88
acons       0.83
viewed      0.82
Activations Density 0.108%