INDEX
Explanations
mentions of the name "Karen" and variations of "Karen" across different contexts
Negative Logits
guard    -0.17
amina    -0.17
gear     -0.16
eon      -0.16
eric     -0.16
kick     -0.15
utow     -0.15
erior    -0.15
ilip     -0.15
cott     -0.15
Positive Logits
ospace   0.18
aken     0.17
Demir    0.16
itan     0.15
åłĤ      0.15
Seit     0.15
317      0.14
egie     0.14
jit      0.14
ismo     0.14
Activations Density: 0.006%