Explanations
instances of phrases related to teasing or making fun of someone or something
expressions related to humor and playful critiques
Negative Logits
Interstitial  -0.76
Decl          -0.74
ript          -0.73
Loading       -0.72
ifted         -0.71
ife           -0.68
oppable       -0.67
Quantity      -0.64
Aid           -0.64
ufact         -0.63
Positive Logits
holes    1.12
fun      0.99
wink     0.91
peek     0.86
quizz    0.86
poking   0.84
poke     0.83
teasing  0.80
glimps   0.79
hole     0.78
Activations Density 0.090%