INDEX
Explanations
time-related numerical data, such as percentages and specific years
occurrences of the word "the"
Negative Logits
abilities   -0.82
craft       -0.74
usb         -0.71
witch       -0.71
quit        -0.69
MK          -0.68
agy         -0.67
lia         -0.65
bending     -0.64
nel         -0.63
Positive Logits
latter      1.32
same        1.24
lowest      1.24
heaviest    1.19
highest     1.16
poorest     1.15
entire      1.10
remainder   1.09
average     1.09
smallest    1.08
Activations Density: 0.650%