INDEX
Explanations
phrases indicating some level of comparison or degree
parentheses used in sentences
Negative Logits
arus      -0.75
oun       -0.71
monton    -0.69
akov      -0.68
oda       -0.67
icter     -0.66
resy      -0.65
APTER     -0.64
ibel      -0.64
igers     -0.62
Positive Logits
Malf       0.72
interest   0.69
multiple   0.62
suff       0.59
ãĥİ        0.59
hearted    0.57
apologies  0.56
Leilan     0.56
inaction   0.55
stellar    0.54
Activations Density: 0.089%