INDEX
Explanations
terms related to social structure and group dynamics
Head Attr Weights
0:0.04
1:0.02
2:0.22
3:0.05
4:0.37
5:0.03
6:0.01
7:0.01
8:0.03
9:0.10
10:0.04
11:0.02
Negative Logits
ャ         -1.77
SN         -1.58
ás         -1.55
numbered   -1.53
nen        -1.52
otted      -1.38
batted     -1.37
ヴ         -1.35
uckland    -1.33
trailed    -1.31
Positive Logits
Redditor         1.55
achine           1.49
fundamentalist   1.41
ioxide           1.39
degrade          1.31
RPG              1.30
sylv             1.29
ilege            1.28
Either           1.24
Actor            1.21
Activations Density 0.029%