INDEX
Explanations
phrases related to the concept of "pizza"
references to a particular type of food or dish
Negative Logits
Token          Logit
Polar          -0.68
appropriation  -0.68
NX             -0.62
fitness        -0.61
fulness        -0.60
Malays         -0.60
conspicuous    -0.59
Koreans        -0.59
Korea          -0.58
izable         -0.57
Positive Logits
Token          Logit
arella         1.44
etta           1.21
erella         1.09
olla           1.00
ucc            0.97
hou            0.96
zz             0.96
ella           0.94
azz            0.94
eret           0.92
Activations Density: 0.027%