Explanations
specific references to people and their roles or attributes
Negative Logits
oplevel       -0.16
rede          -0.15
onen          -0.14
')}}"></      -0.14
yg            -0.14
umba          -0.14
cigaret       -0.14
rado          -0.13
OCUMENT       -0.13
ournament     -0.13
Positive Logits
equivalent    0.30
Equivalent    0.23
same          0.23
same          0.21
equivalents   0.21
occasional    0.19
SAME          0.19
remains       0.19
kinds         0.18
kind          0.18
Activations Density: 0.581%