INDEX
Explanations
phrases expressing a lack of interest or indifference
phrases related to interest or disinterest in various topics or activities
Negative Logits
byter    -0.69
hem      -0.65
ggles    -0.65
osen     -0.61
bearer   -0.60
orbit    -0.60
ascend   -0.60
opers    -0.59
ioxide   -0.58
ework    -0.58
Positive Logits
ately    0.86
¬        0.85
inery    0.78
lessly   0.72
uates    0.71
ª        0.69
enza     0.68
¿½       0.68
uated    0.67
ulous    0.65
Activations Density: 0.079%