Negative Logits
Token             Logit
.poi              -0.09
panic             -0.09
ussed             -0.09
flush             -0.09
:::::::::         -0.08
ä¸Ī               -0.08
proof             -0.08
AllowAnonymous    -0.08
sto               -0.08
straight          -0.08
Positive Logits
Token             Logit
warning           0.18
warnings          0.18
disclaimer        0.16
warn              0.14
Warning           0.13
warnings          0.13
cave              0.13
boiler            0.12
warn              0.11
Cave              0.11
Activations Density 0.057%