Explanations
recurring phrases that denote ranking or classification, particularly in music and entertainment contexts
Negative Logits
Token         Logit
superf        -0.70
sewing        -0.65
leaflets      -0.65
ingredients   -0.65
detriment     -0.64
chem          -0.64
trooper       -0.64
dart          -0.63
laundry       -0.63
annexed       -0.63

Positive Logits
Token         Logit
itialized      1.10
selves         1.09
Them           0.98
Him            0.88
Himself        0.87
Us             0.86
slaught        0.84
oret           0.83
Than           0.80
Course         0.79

Activations Density 0.090%
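
The density figure can be read as the share of token positions on which the feature is active. The sketch below assumes that definition (a position counts as active when its activation is above zero); the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means active positions divided by total positions.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 9 active positions out of 10,000 tokens.
sample = [0.0] * 9991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.090%
```

Under this reading, a density of 0.090% corresponds to roughly 9 active positions per 10,000 tokens, which makes this a sparse feature.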