INDEX
Explanations
explicit references to food
Following qualifying words, like "sounding" or "be"
insincere or exaggerated language
Negative Logits
tahui       -0.60
tikra       -0.53
pédie       -0.52
merve       -0.49
lära        -0.48
المعيارى    -0.47
nakalista   -0.46
OCCURRED    -0.46
rör         -0.45
räck        -0.45
Positive Logits
exaggeration    0.98
cliché          0.95
exaggerating    0.91
cliche          0.91
clich           0.80
sounding        0.79
understatement  0.78
exaggerate      0.78
biased          0.78
exagger         0.78
Activations Density 0.279%