Explanations
words around "your" or possessives
Negative Logits
Congress         0.41
buckwheat        0.40
হাট              0.40
antip            0.39
museum           0.39
bazaar           0.38
travel           0.38
kebab            0.38
confectionery    0.37
الج              0.37
Positive Logits
quickly          0.40
responsabil      0.39
Seth             0.39
ĝi               0.38
önet             0.38
narr             0.37
ownt             0.36
('-              0.36
closing          0.36
('<              0.36
Activations Density 0.001%