INDEX
Explanations
references to categories or classifications within a structured format
Negative Logits
ontent             -0.16
stp                -0.16
ège                -0.15
avic               -0.15
engin              -0.14
zion               -0.14
fac                -0.14
Raz                -0.14
Benn               -0.14
alam               -0.14
Positive Logits
ArgsConstructor    0.17
classes            0.16
classes            0.16
Classes            0.16
Classes            0.15
Families           0.14
æľ                 0.14
гро                0.14
(classes           0.14
PAIR               0.13
Activations Density 0.016%