INDEX
Explanations
phrases related to different viewpoints or attitudes
references to varying viewpoints or perspectives
Negative Logits
liam          -0.75
avis          -0.73
cakes         -0.72
Interstitial  -0.71
nard          -0.68
ertodd        -0.67
vez           -0.67
nar           -0.67
ardy          -0.66
oval          -0.65
Positive Logits
perspective   0.95
perspectives  0.91
viewpoint     0.89
Perspective   0.83
viewpoints    0.79
views         0.76
spection      0.76
view          0.75
Lens          0.71
lens          0.71
Activations Density 0.022%