INDEX
Explanations
phrases related to communication behaviors and actions
phrases that suggest skepticism or critical thinking in various contexts
Negative Logits
Myster        -0.72
Bulldogs      -0.72
NEY           -0.71
YN            -0.70
PB            -0.69
Interstitial  -0.66
Eth           -0.66
Nickel        -0.66
Padres        -0.65
Bubble        -0.65
Positive Logits
lash          0.77
gow           0.77
Shaw          0.75
Machina       0.74
ilib          0.73
ruck          0.72
crus          0.72
rike          0.71
erers         0.71
uit           0.70
Activations Density: 0.267%