Explanations
mentions of specific substances or medical conditions, particularly caffeine-related terms
terms related to caffeine and its effects
Negative Logits
Token      Logit
Ĥª         -0.77
flies      -0.74
ships      -0.73
ship       -0.72
chool      -0.72
soever     -0.69
\<         -0.67
Turtles    -0.66
irt        -0.66
OTA        -0.64
Positive Logits
Token      Logit
Osw        0.87
inated     0.81
ishment    0.78
ting       0.78
rection    0.76
ining      0.75
ormal      0.74
ornings    0.72
zers       0.72
aser       0.71
Activations Density: 0.053%