INDEX
Explanations
proper names (often related to famous individuals) or keywords such as "recipe", "cafe", "performance", and "horror film"
the presence of certain names and terms, particularly those related to Australian culture or context
Negative Logits
lehem            -0.72
fines            -0.71
grandchildren    -0.63
aez              -0.62
setbacks         -0.62
cause            -0.62
ibles            -0.60
rosso            -0.60
acute            -0.60
uously           -0.59
Positive Logits
velt             0.99
Rampage          0.78
heid             0.77
Janeiro          0.74
lund             0.70
kefeller         0.68
estern           0.68
Unix             0.68
restling         0.66
phant            0.65
Activations Density 0.128%