INDEX
Explanations
words related to health, medicine, and well-being
Negative Logits
Token       Logit
Brill       -0.74
ANG         -0.63
Murd        -0.61
Page        -0.61
Quantity    -0.61
ilibrium    -0.59
Frames      -0.59
Swing       -0.59
Sail        -0.58
Mansion     -0.58
Positive Logits
Token       Logit
merga       1.19
ocy         0.89
ofer        0.79
opers       0.77
bara        0.74
cies        0.73
ongyang     0.72
otin        0.72
ggle        0.72
ococ        0.72
Activations Density: 0.016%