INDEX
Explanations
words related to providing support or evidence for a claim or action
phrases related to support or endorsement
Negative Logits
anooga      -0.77
anny        -0.73
ities       -0.72
odor        -0.72
hester      -0.72
tein        -0.71
Osc         -0.67
itarian     -0.67
apeake      -0.67
awk         -0.67
Positive Logits
swing       0.82
backing     0.76
backed      0.76
drive       0.74
abies       0.74
track       0.71
GROUND      0.70
backed      0.70
drops       0.68
country     0.67
Activations Density 0.016%