INDEX
Explanations
phrases related to perception or interpretation of the world
expressions of perception and personal beliefs
Negative Logits
Token      Logit
auga       -0.77
ARS        -0.66
entirety   -0.62
txt        -0.61
amiya      -0.61
fts        -0.60
icken      -0.60
umbnail    -0.59
gart       -0.58
ousing     -0.57
Positive Logits
Token          Logit
interacting    0.77
interacts      0.74
uate           0.71
interactions   0.70
communicating  0.69
differs        0.68
things         0.66
versus         0.66
interact       0.65
interpreting   0.65
Activations Density 0.213%
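
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature has a nonzero activation; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and only illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 2 of which fire.
toy = [0.0] * 998 + [1.3, 0.4]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.200%
```

Under this reading, a density of 0.213% would mean the feature fires on roughly 2 of every 1000 sampled tokens.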