INDEX
Explanations
mentions of names starting with "Kat" followed by a combination of different characters
references to the name "Kat" and its variations
Negative Logits
relations        -0.70
matter           -0.63
cill             -0.62
ENCE             -0.61
intertw          -0.60
Relations        -0.60
contraceptives   -0.59
ENSE             -0.59
GBT              -0.59
cffff            -0.59
Positive Logits
rina             1.27
apult            1.08
usha             1.03
anas             1.02
apesh            1.02
mand             0.97
rine             0.96
Kat              0.93
kat              0.92
inka             0.92
Activations Density 0.027%