INDEX
Explanations
complex phrases or descriptions that involve a mixture of positive and negative attributes or actions
complex narrative structures and clever tactics
Negative Logits
eph         -0.75
reports     -0.75
izons       -0.73
Views       -0.73
Libraries   -0.72
mbuds       -0.72
rane        -0.70
amples      -0.70
isters      -0.69
Areas       -0.68
Positive Logits
ploy        1.58
prank       1.43
scheme      1.39
stunt       1.39
disguise    1.37
plot        1.33
deception   1.30
trick       1.29
miracle     1.27
plan        1.24
Activations Density: 0.531%