NEGATIVE LOGITS
irre          0.42
ഉരു           0.41
infusion      0.39
mishaps       0.39
Irwin         0.39
iri           0.39
ুখে           0.38
ഊ             0.38
violations    0.37
REFERENCE     0.37
POSITIVE LOGITS
printed       1.23
printed       1.15
Printed       1.09
[token not recovered]  1.07
Printed       1.05
印刷          0.99
[token not recovered]  0.95
[token not recovered]  0.93
[token not recovered]  0.93
printing      0.90
Activations Density 0.002%