INDEX
Explanations
comparisons between people or objects
similes or comparisons using the word "like."
Negative Logits
ourse     -0.83
ESA       -0.81
alez      -0.78
arcity    -0.77
alt       -0.74
irtual    -0.73
inion     -0.72
Ö¼        -0.72
elt       -0.72
untu      -0.72
Positive Logits
lihood       1.06
lier         1.01
liest        0.96
crap         0.82
gib          0.82
garbage      0.74
trash        0.73
lifeless     0.72
yours        0.68
Andromeda    0.66
Activations Density 0.048%