INDEX
Explanations
occurrences of phrases that denote or represent concepts of identity or belonging
Head Attr Weights
0:0.03
1:0.03
2:0.12
3:0.09
4:0.12
5:0.04
6:0.19
7:0.02
8:0.09
9:0.11
10:0.08
11:0.04
Negative Logits
machinery:-1.57
perpend:-1.51
Klu:-1.48
��:-1.42
simulac:-1.40
inappropriately:-1.37
YPG:-1.37
Oy:-1.37
culprit:-1.37
Palest:-1.36
Positive Logits
bern:2.01
reon:1.75
winter:1.61
urrection:1.56
rue:1.56
arten:1.55
Reviewer:1.51
ictionary:1.51
ilan:1.51
endor:1.48
Activations Density 0.052%