INDEX
Explanations
phrases related to enjoyment and fun activities
Head Attr Weights
0:0.01
1:0.02
2:0.05
3:0.03
4:0.02
5:0.05
6:0.05
7:0.12
8:0.05
9:0.06
10:0.06
11:0.41
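The weights above can be read programmatically. The following is a minimal sketch, assuming each line has the form head_index:weight; the function name parse_head_weights and the variable names are hypothetical and not part of the dashboard. It finds the head with the largest weight and the total of the listed values.

```python
# Head attribution weights copied from the listing above (head_index:weight).
RAW_WEIGHTS = """
0:0.01
1:0.02
2:0.05
3:0.03
4:0.02
5:0.05
6:0.05
7:0.12
8:0.05
9:0.06
10:0.06
11:0.41
"""

def parse_head_weights(text):
    """Parse 'index:weight' lines into a dict mapping int head index to float weight."""
    weights = {}
    for line in text.strip().splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights

weights = parse_head_weights(RAW_WEIGHTS)
top_head = max(weights, key=weights.get)  # head with the largest weight
total = sum(weights.values())             # sum of the rounded values as listed

print(f"top head: {top_head} (weight {weights[top_head]:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

The listed values are rounded to two decimals, so their sum need not be exactly 1.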
Negative Logits
ortium: -1.70
��: -1.51
pex: -1.49
defic: -1.49
dilig: -1.40
yrus: -1.33
ailability: -1.32
registry: -1.32
pta: -1.30
ascus: -1.28
Positive Logits
Reviewer: 1.55
trivia: 1.47
playground: 1.31
salsa: 1.29
Monkey: 1.26
scen: 1.24
mire: 1.24
Angelo: 1.23
comedy: 1.22
Lyndon: 1.20
Activations Density 0.045%