INDEX
Explanations
phrases related to abstract concepts or emotions
expressions related to perception or understanding of experiences
Negative Logits
raphic     -0.75
sites      -0.71
astered    -0.70
annis      -0.69
helicop    -0.68
essee      -0.64
script     -0.63
skill      -0.62
silent     -0.62
aunting    -0.62
Positive Logits
sense      0.98
terday     0.92
Sense      0.83
sense      0.81
ibly       0.80
omatic     0.78
uitive     0.76
ively      0.76
ible       0.75
ibility    0.74
Activations Density: 0.014%