INDEX
Explanations
references to relationships and emotional manipulation
Head Attr Weights
0:0.03
1:0.01
2:0.09
3:0.06
4:0.18
5:0.03
6:0.03
7:0.29
8:0.03
9:0.05
10:0.08
11:0.08
Negative Logits
iatus: -1.40
anomaly: -1.40
eeper: -1.39
forgiven: -1.39
Redditor: -1.38
DragonMagazine: -1.37
dividends: -1.35
fixed: -1.33
grievances: -1.31
emption: -1.31
Positive Logits
Chinatown: 1.42
cities: 1.36
Gron: 1.33
Bosnia: 1.29
strongh: 1.29
hubs: 1.29
66666666: 1.28
neighborhoods: 1.28
towns: 1.27
pher: 1.25
Activations Density: 0.000%