INDEX
Explanations
phrases related to a small amount or degree of something
Negative Logits
inarily    -0.79
itures     -0.75
idth       -0.74
ocity      -0.69
iership    -0.69
æ©         -0.67
ovies      -0.67
apons      -0.67
velt       -0.66
endars     -0.66
Positive Logits
umen       0.87
terness    0.80
bit        0.76
ener       0.73
prick      0.70
aspirin    0.68
wig        0.67
overboard  0.67
luck       0.65
mask       0.64
Activations Density 0.023%