INDEX
Explanations
words related to types of food, particularly cheeses, and actions related to physical touch, such as groping
words related to cheese and incidents involving groping
Negative Logits
Origin          -0.70
Roman           -0.67
Idaho           -0.67
Macedonia       -0.66
Hale            -0.66
ware            -0.66
Dakota          -0.65
Commonwealth    -0.64
Nile            -0.62
Yemeni          -0.61
Positive Logits
es              1.44
ecake           1.10
ese             1.01
esse            1.00
esy             0.99
uers            0.95
ues             0.93
eki             0.92
eless           0.92
erness          0.91
Activations Density 0.031%