INDEX
NEGATIVE LOGITS
voiced       0.78
Abraham      0.78
Aa           0.77
siting       0.77
validator    0.75
voicing      0.75
Abraham      0.74
legitim      0.74
moneys       0.69
pound        0.69
POSITIVE LOGITS
operation    0.75
output       0.75
drive        0.70
operate      0.68
disk         0.66
dauer        0.66
Kelu         0.66
Operate      0.65
Operation    0.65
OUTPUT       0.65
Activations Density 0.001%