INDEX
Explanations
Attends to tokens that indicate a specific property, function, or classification, from tokens that define or reference that same property in a structured way.
Head Attr Weights
0:0.07
1:0.11
2:0.08
3:0.15
4:0.11
5:0.07
6:0.21
7:0.16
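As a reading aid for the weights listed above, the following minimal sketch ranks the eight heads by attribution weight and sums the displayed values. The dictionary name `head_weights` is an illustrative choice, not part of the dashboard, and the values are simply the rounded figures shown here, so the total need not be exactly 1.

```python
# Head attribution weights copied from the list above (head index -> weight).
# These are the rounded values as displayed, not the underlying raw numbers.
head_weights = {0: 0.07, 1: 0.11, 2: 0.08, 3: 0.15, 4: 0.11, 5: 0.07, 6: 0.21, 7: 0.16}

# Rank heads from largest to smallest weight (ties keep their listed order).
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)

# Sum of the displayed weights, rounded to the precision shown on the page.
total = round(sum(head_weights.values()), 2)

for head, weight in ranked:
    print(f"head {head}: {weight:.2f}")
print(f"total: {total:.2f}")
```

On these figures, head 6 carries the largest weight (0.21), followed by heads 7 and 3, and the displayed weights add up to 0.96.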
Negative Logits
ülé: -0.28
notice: -0.27
notice: -0.25
ValueGenerated: -0.24
stat: -0.24
вики: -0.24
note: -0.23
expri: -0.23
mex: -0.23
ná: -0.23
Positive Logits
tvguidetime: 0.44
#+#: 0.40
المعيارى: 0.38
ंदीखरीदारी: 0.37
帖最后由: 0.35
المناصب: 0.35
Geplaatst: 0.35
Референце: 0.33
صوتيه: 0.32
viewDidLoad: 0.32
Activations Density 0.000%