INDEX
Explanations
food-related words
specific food items and scientific terms related to health and technology
Negative Logits
Pak    -0.73
RH     -0.69
KR     -0.68
NJ     -0.63
TW     -0.63
VID    -0.62
CB     -0.62
Nap    -0.61
NC     -0.61
NEC    -0.61
Positive Logits
sers          0.75
olean         0.74
tml           0.74
igger         0.72
eers          0.68
osate         0.67
ernel         0.67
forcement     0.66
vertising     0.66
̶             0.66
Activations Density 0.255%