Explanations
expressions indicating reaching a high or extreme level or taking drastic actions
phrases indicating limits or boundaries in actions or ideas
Negative Logits
itu        -0.76
bed        -0.68
ivered     -0.65
Sovere     -0.65
ipation    -0.63
ixture     -0.63
Nep        -0.62
ukes       -0.61
stable     -0.59
ixtures    -0.59
Positive Logits
WARD       0.89
sidx       0.82
overboard  0.81
derog      0.79
towards    0.75
unnoticed  0.74
irtual     0.74
toward     0.74
boldly     0.72
lengths    0.68
Activations Density: 0.057%