Explanations
words related to social issues and statistics
Head Attr Weights
0:0.06
1:0.02
2:0.05
3:0.15
4:0.36
5:0.04
6:0.02
7:0.02
8:0.07
9:0.07
10:0.05
11:0.05
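One way to read the weights above is to rank the heads and see how much of the listed attribution the top ones carry. The following is a minimal sketch: the names `head_weights` and `rank_heads` are hypothetical, and the page does not say whether the weights are meant to sum to 1 (as displayed they sum to about 0.96, presumably because of rounding).

```python
# Head attribution weights copied from the list above (head index -> weight).
head_weights = {
    0: 0.06, 1: 0.02, 2: 0.05, 3: 0.15, 4: 0.36, 5: 0.04,
    6: 0.02, 7: 0.02, 8: 0.07, 9: 0.07, 10: 0.05, 11: 0.05,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]

# Sum of the displayed weights; not assumed to be exactly 1.
total = sum(head_weights.values())

# Share of the displayed total carried by the two largest heads.
top_two_share = (ranked[0][1] + ranked[1][1]) / total

print(f"top head: {top_head} (weight {top_weight})")
print(f"top two heads carry {top_two_share:.0%} of the displayed total")
```

On these numbers, head 4 has the largest weight, followed by head 3.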
Negative Logits
Token            Logit
IPM              -1.62
"]=>             -1.59
playbook         -1.52
TPPStreamerBot   -1.50
herpes           -1.50
RELE             -1.50
◼                -1.47
resil            -1.44
aunders          -1.43
サーティワン     -1.40
Positive Logits
Token            Logit
others           1.90
Others           1.83
umber            1.66
Others           1.51
old              1.46
onne             1.45
amera            1.45
likewise         1.39
Tunnel           1.38
meanwhile        1.38
Activations Density: 0.076%