INDEX
Explanations
phrases related to hobbies and interests
Negative Logits
Dialog            -0.68
struct            -0.64
Prosecut          -0.62
Delivery          -0.61
administr         -0.60
arrang            -0.59
administration    -0.59
Islamic           -0.58
ictional          -0.57
Officers          -0.57
Positive Logits
rejoice           1.82
beware            1.59
alike             1.35
flock             1.11
cringe            0.97
Beware            0.97
rave              0.96
appreciate        0.93
crave             0.91
everywhere        0.90
Activations Density: 0.197%