Explanations
the word "theme" followed by a specific context
references to themes in various contexts
Negative Logits
Token      Logit
UGE        -0.81
ards       -0.78
aq         -0.73
lishes     -0.73
dp         -0.73
ARD        -0.73
DERR       -0.72
ribes      -0.70
rets       -0.70
rations    -0.70
Positive Logits
Token      Logit
theme      1.11
themes     0.99
theme      0.96
ĸļ         0.87
Theme      0.84
park       0.79
ology      0.77
forest     0.75
ography    0.74
motif      0.73
Activations Density: 0.020%