Explanations
adjectives describing the intensity or quality of something
phrases indicating low quality or lack of effectiveness
Negative Logits
YES         -0.74
ilts        -0.71
Leaks       -0.66
çļ          -0.65
pez         -0.65
probably    -0.63
Potential   -0.61
Ĥİ          -0.60
origin      -0.59
kamp        -0.57
Positive Logits
anymore     1.14
glamorous   0.94
flashy      0.93
forgiving   0.90
bothered    0.86
bother      0.84
conducive   0.81
nor         0.81
appealing   0.80
pleasant    0.79
Activations Density: 0.064%