INDEX
Explanations
phrases describing compositions or structures
phrases indicating the composition or structure of a subject
Negative Logits
hawks         -0.67
ilings        -0.67
bury          -0.65
Flying        -0.61
Karma         -0.60
WT            -0.60
Taj           -0.60
Dull          -0.58
fly           -0.58
Sil           -0.58
Positive Logits
consist       0.89
galitarian    0.87
ģ«            0.85
consisted     0.81
comprised     0.80
solely        0.79
consists      0.78
encies        0.78
ĸļ            0.76
alion         0.76
Activations Density 0.010%