INDEX
Explanations
references to research studies being reported
Negative Logits
disciplina  -0.52
amada  -0.50
offent  -0.49
claimed  -0.48
tercero  -0.48
volen  -0.47
proof  -0.46
~  -0.46
[  -0.46
Canto  -0.46
Positive Logits
―――――  0.85
pinulongan  0.81
ThroughAttribute  0.81
<=",  0.81
neſs  0.74
חיצוניים  0.73
Italijani  0.72
itſelf  0.72
estekak  0.71
ContentAsync  0.70
Activations Density 0.041%