INDEX
Explanations
phrases indicating a critical perspective on institutions or situations
Head Attr Weights
0:0.07
1:0.08
2:0.07
3:0.07
4:0.07
5:0.08
6:0.08
7:0.09
8:0.07
9:0.08
10:0.08
11:0.08
Negative Logits
uber         -2.60
keyes        -2.60
76561        -2.52
ppel         -2.52
esp          -2.45
Blizz        -2.44
oo           -2.42
reau         -2.31
ャ           -2.29
usercontent  -2.29
Positive Logits
incerity     2.54
���          2.39
uncovered    2.36
RIC          2.35
eteria       2.31
carts        2.31
istor        2.29
rudimentary  2.27
unta         2.26
schematic    2.25
Activations Density 0.000%