INDEX
Explanations
mentions of items or actions related to edible products
references to religious texts and items
Negative Logits
brance    -0.77
abases    -0.75
NATO      -0.73
AUT       -0.71
period    -0.69
atorium   -0.67
yss       -0.66
poll      -0.66
Mer       -0.64
alg       -0.63
Positive Logits
ibles     1.48
poons     0.98
ible      0.90
IBLE      0.88
poon      0.79
ibility   0.78
irtual    0.78
uits      0.76
veter     0.75
ibly      0.74
Activations Density 0.005%