INDEX
Explanations
phrases indicating speculation or suggestions
statements making suggestions or indicating possibilities
Negative Logits
ãĥīãĥ©    -0.78
arak      -0.76
à¦        -0.76
apest     -0.74
arez      -0.68
ocaust    -0.67
ãĤ´ãĥ³    -0.67
utch      -0.65
reth      -0.65
aughed    -0.65
Positive Logits
whoever     0.98
perhaps     0.91
there       0.86
they        0.85
tensions    0.84
something   0.84
maybe       0.83
someone     0.81
these       0.79
somebody    0.78
Activations Density: 0.237%