Explanations
mentions of a specific media outlet or publication
references to the word "Fam" or variations of it in different contexts
Negative Logits
dress        -0.74
Wonderland   -0.71
skirts       -0.71
graduate     -0.69
overs        -0.68
dwarf        -0.68
tint         -0.68
pine         -0.66
İĭ           -0.66
DOWN         -0.65
Positive Logits
ilial        1.69
iliar        1.53
ilitation    1.12
ilar         1.05
igl          1.04
uci          0.98
ilit         0.97
ilies        0.95
itsu         0.95
ili          0.95
Activations Density 0.010%
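
A density figure like this is commonly read as the fraction of sampled tokens on which the feature fires at all. The sketch below assumes that definition. The function name `activation_density` and the sample activations are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the dashboard's exact definition
    is not stated in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 10,000.
sample = [0.0] * 9999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.010%
```

Under this reading, a density of 0.010% means the feature is active on roughly one token in ten thousand, which is typical of a sparse, narrowly selective feature.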