Explanations
contexts related to socializing or spending time with others
Negative Logits
Token             Logit
ovi               -0.75
Ī                 -0.73
OIL               -0.68
ELS               -0.68
士                -0.68
geries            -0.68
ela               -0.66
ĪĴ                -0.65
understatement    -0.65
ãĥ«               -0.64
Positive Logits
Token             Logit
buddies           0.81
partying          0.81
somewhere         0.79
sites             0.78
posts             0.78
Soda              0.78
ta                0.77
amongst           0.77
between           0.76
with              0.75
Activations Density: 0.011%