Explanations
attends to transformation-related tokens from class-related tokens
Head Attr Weights
0:0.08
1:0.10
2:0.09
3:0.16
4:0.13
5:0.08
6:0.18
7:0.14
Negative Logits
opardy:-0.24
__(/*!:-0.24
erfolgre:-0.23
builtin:-0.23
pesanan:-0.23
logna:-0.22
lệnh:-0.22
UVWXYZ:-0.21
gesetz:-0.21
ansatte:-0.21
Positive Logits
InjectAttribute:0.39
ThemeData:0.36
UnsafeEnabled:0.36
становника:0.35
محفوظة:0.35
MethodManager:0.34
PerformLayout:0.34
שוליים:0.31
afficheront:0.31
صوتيه:0.31
Activations Density 0.036%