INDEX
Explanations
adverbs indicating probability or expectation
Negative Logits
iya        -0.75
bern       -0.75
issy       -0.72
ortmund    -0.70
enberg     -0.70
ilian      -0.69
aeus       -0.68
osi        -0.68
elight     -0.67
aan        -0.66
Positive Logits
underest          0.80
underestimate     0.78
overest           0.75
underestimated    0.74
won               0.74
misunder          0.73
won               0.72
influenced        0.70
wont              0.69
exagger           0.69
Activations Density 0.423%